General Disclaimer 


One or more of the Following Statements may affect this Document 


• This document has been reproduced from the best copy furnished by the 
organizational source. It is being released in the interest of making available as 
much information as possible. 


• This document may contain data, which exceeds the sheet parameters. It was 
furnished in this condition by the organizational source and is the best copy 
available. 


• This document may contain tone-on-tone or color graphs, charts and/or pictures, 
which have been reproduced in black and white. 


• This document is paginated as submitted by the original source. 


• Portions of this document are not fully legible due to the historical nature of some 
of the material. However, it is the best reproduction available from the original 
submission. 


Produced by the NASA Center for Aerospace Information (CASI) 




"Made available under NASA sponsorship
in the interest of early and wide dissemination
of Earth Resources Survey
Program information and without liability
for any use made thereof."


CROP IDENTIFICATION AND ACREAGE MEASUREMENT 
UTILIZING LANDSAT IMAGERY 




Statistical Reporting Service 
United States Department of Agriculture 
Washington, D.C. 20250 


MARCH 1976 



CROP IDENTIFICATION AND ACREAGE 
MEASUREMENT UTILIZING LANDSAT IMAGERY 


Donald H. Von Steen and William H. Wigton 


Statistical Reporting Service 
United States Department of Agriculture 
Washington, D.C. 20250 


March 1976 


Original photography may be purchased from:
EROS Data Center 
10th and Dakota Avenue 

Sioux Falls, SD 57198 


Prepared for: Goddard Space Flight Center 

Greenbelt, Maryland 20771

#1013A, S-70251AG3

Technical Monitor: Frederick Gordon 


TECHNICAL REPORT STANDARD TITLE PAGE


1. Report No.

2. Government Accession No.

3. Recipient's Catalog No.

4. Title and Subtitle

CROP IDENTIFICATION AND ACREAGE MEASUREMENT
UTILIZING LANDSAT (FORMERLY ERTS) IMAGERY

5. Report Date

6. Performing Organization Code

7. Author(s)

Donald H. Von Steen and William H. Wigton

8. Performing Organization Report No.

9. Performing Organization Name and Address

Statistical Reporting Service
United States Department of Agriculture
Washington, D.C. 20250

10. Work Unit No.

11. Contract or Grant No.

#1013A, S-70251AG3

12. Sponsoring Agency Name and Address

Goddard Space Flight Center
Greenbelt, Maryland 20771

13. Type of Report and Period Covered

Final

14. Sponsoring Agency Code

15. Supplementary Notes

Mr. Frederick Gordon, Technical Monitor


16. Abstract 


This paper summarizes work completed by the Statistical Reporting Service, USDA,
using ground observations obtained from our area frame. This frame is the key
to making good estimates of crop acreages and yield as well as income and livestock.
It turns out to be essential for making good use of satellite imagery as
well. One critical step in using multi-spectral scanner CCT's is to identify
and estimate the amount of energy for each crop that is reflected in each band.
Area frame data can provide unbiased estimates of crop reflectances in the
whole image, since the data were selected scientifically from the whole image.

Another equally critical step in making immediate use of satellite imagery is
needed after the image has been classified. By observing how known probability
data are classified or, more accurately, misclassified, we can adjust total image
classifications so that pixels can be converted to acres in a statistically
sound way.

The work presently being done by SRS directly evolved from this study. The
project's intent is to reduce sampling error of the crop acreage estimates.


17. Key Words Suggested by Author

ERTS Data Use
Acreage Estimation
Crop Acreage and Production Estimates

18. Distribution Statement

19. Security Classif. (of this report)

Unclassified

20. Security Classif. (of this page)

Unclassified

21. No. of Pages

190

22. Price




"And the Lord said, 'Behold, they are one people, and they have all
one language; and this is only the beginning of what they will do;
and nothing that they propose to do will now be impossible for them.
Come, let us go down, and there confuse their language, that they
may not understand one another's speech.' Therefore its name was
called Babel, because there the Lord confused the language of all the
earth.


And we have been misleading each other ever since."


Genesis 11:6-9


Dr. Thomas Szasz,
The Second Sin,
1974


““r^eSoS" 0 ” “ dlSPel SO " e * hl8 aspect 


The Authors 


Acknowledgements

We wish to express our gratitude to the LARS staff for their support.

Etherid?e e for th“tta? Cla d 8 ° t0 Dr ' Marvln B,uer »" d He. Jeanne 

larsys To ac o co^p?i s “”': u ? n i: d s : d i8 “^:. rCTde « d ^ «- ~m- 

We extend our appreciation to our scientific monitor, Mr.
Frederick Gordon, for his patience during the final draft of this document.




Preface 


This report provides step-by-step details of nearly two and one-half years of
work at the Statistical Reporting Service (SRS) under NASA contract AG328.

The contract specified that we perform crop classification of LANDSAT data
(formerly ERTS) in four states. All the classification was performed at Purdue
using LARSYS. Other systems were tried, but LARSYS was flexible enough
to suit our needs.

The basic objective was to evaluate LANDSAT data and to find ways to use this
data to improve the present acreage estimates. This is no easy task since the
current estimates are cost effective and sufficient in most areas, the exception
being local estimates.

The procedures that were developed were intended to improve state estimates or
estimates for strata within a state. This project is being followed by a 1975-76
program, which is to perform wall-to-wall classification of LANDSAT data in
Illinois, Kansas, and 44 counties in Texas.

Specifically, the objectives as presented in the original proposal are: 

1. Develop methods of crop species identification from space imagery by 
photo interpretation and discrimination technique within the context 
of: (1) multiple frame sampling, and (2) an alternative approach 
using the techniques of double sampling. The study would compare the 
accuracy of results using LANDSAT imagery compared with the additional 
improvement using aircraft imagery when both are combined with ground 
data. 

2. Develop methods for estimating crop acreages by extracting information
from space imagery in the context of the agency's operating
constraints.

The scope of this ambitious study was somewhat reduced since much of the 
imagery came very late in the growing season. 

Less than optimum imagery was available, so less than optimum results were 
obtained. Nevertheless, the conclusions were that if satellite imagery were 
available and, if software were available, LANDSAT type data could be useful 
and provide substantial gains in state estimates. SRS has moved ahead to 
build software, so that when the imagery is available, SRS will be ready to 
use it. However, it is vital that the data be ready for processing within 48 
hours after it has been taken. Otherwise, it is of little value. 




Table of Contents


FOREWORD
PREFACE
TABLE OF CONTENTS
LIST OF TABLES
LIST OF ILLUSTRATIONS
INTRODUCTION
DATA ACQUISITION
FLIGHTLINE GROUND OBSERVATION
LANDSAT IMAGERY
AERIAL PHOTOGRAPHY
SOFTWARE AND DATA PROCESSING
SOFTWARE IMPLEMENTATION
DATA ANALYSIS
MISSOURI LANDSAT
KANSAS LANDSAT
SOUTH DAKOTA LANDSAT
IDAHO LANDSAT
RESULTS OF CLASSIFICATION OF AERIAL PHOTOGRAPHY
CROP ACREAGE ESTIMATION
COST ANALYSIS
APPENDIX A - ENUMERATORS' INSTRUCTIONS
APPENDIX B - STATE OFFICE INSTRUMENTS
APPENDIX C - GREY-SCALE MAP COMPUTER PROGRAM
APPENDIX D - DETAILED INSTRUCTIONS FOR MICRODENSITOMETER SCANNING OF
AERIAL PHOTOGRAPHS
APPENDIX E - A PROGRAM TO CONVERT PDS MICRODENSITOMETER SCAN LINES INTO
SAS COMPATIBLE OBSERVATIONS
APPENDIX F - FIELD EXTRACTION PROGRAM VERSION 1 AND 2




Table 1 


States and Number of Segments in Study Area 
Table 2 

Major Crops Included in LANDSAT Investigation 

Table 3 

Distribution of Number of Fields by Size and Crop 
for the Four County Test Areas in South Central Idaho 

Table 4 

Distribution of Number of Fields by Size and Crop 
for Crop Reporting District 7 - KANSAS 

Table 5 

Distribution of Number of Fields by Size and Crop 
for Crop Reporting District 9 - MISSOURI

Table 6 

Distribution of Number of Fields by Size and Crop 
for Crop Reporting District 6 - SOUTH DAKOTA 

Table 7 

Number of Segments, Tracts, and Fields by Test Site. 

Table 8 

Estimated Acres, Standard Errors, and Coefficients of 
Variation by Crop and Date, IDAHO, 1972 

Table 9 

Estimated Acres, Standard Errors, and Coefficients 
of Variation by Crop and Date, KANSAS, 1972 




Table 10 


Estimated Acres, Standard Errors, and Coefficients 
of Variation by Crop and Date, MISSOURI, 1972

Table 11 

Estimated Acres, Standard Errors, and Coefficients 
of Variation by Crop and Date, SOUTH DAKOTA, 1972 

Table 12

Number of Segments Within Flightline by Flightline and by State

Table 13

Estimated Totals, Between and Within Flightline Components of Variance,
Standard Errors, and Coefficients of Variation of the Estimated Totals
by Crops, Missouri Study Area, August 7-10, 1972

Table 14 

Estimated Totals, Between and Within Flightline Components of Variance, 
Standard Errors, and Coefficients of Variation of the Estimated Totals 
by Crops, Missouri Study Area, September 11-15, 1972 

Table 15 

Estimated Totals, Between and Within Flightline Components of Variance, 
Standard Errors, and Coefficients of Variation of the Estimated Totals 
by Crops, Missouri Study Area, October 10-13, 1972 

Table 16 

Estimated Totals, Between and Within Flightline Components of Variance, 
Standard Errors, and Coefficients of Variation of the Estimated Totals 
by Crops, Kansas Study Area, September 11-15, 1972 

Table 17 

Estimated Totals, Between and Within Flightline Components of Variance,
Standard Errors, and Coefficients of Variation of the Estimated Totals
by Crops, Idaho Study Area, August 7-10, 1972

Table 18 

Estimated Totals, Between and Within Flightline Components of Variance,
Standard Errors, and Coefficients of Variation of the Estimated Totals 
by Crops, South Dakota Study Area, August 7-10, 1972 




Table 19

Missouri Aerial Photography 
Table 20 

Kansas Aerial Photography 
Table 21 

South Dakota Aerial Photography 
Table 22 

Idaho Aerial Photography 
Table 23 

Sensor Spectral Band Relationships 
Table 24 

Classification Matrix of Quadratic Discriminant Functions With 
Unequal Prior Probabilities using Data from Three Overflights, 

Missouri Study Area 

Table 25 

Classification Matrix of Quadratic Discriminant Functions With 
Equal Prior Probabilities Using Data From Three Overflights, 

Missouri Study Area 

Table 26 

Marginal Estimate and Difference From Actual Values 

Table 27 

Per-Field Classification Matrix Based on Data 
From Three Overflights 

Table 28 

Classification Matrix Using August 26, 1972, MSS Bands 
4,5, and 7 with Unequal Prior Probabilities 

Table 29

Classification Matrix Using September 13, 1972, MSS Bands
5 and 7 with Unequal Prior Probabilities




Table 30 


Classification Matrix Using October 2, 1972, MSS Bands
4, 5, 6, and 7 with Unequal Prior Probabilities 

Table 31 


Comparison of Multitemporal Classification Performance to 
Classification of Single Dates 

Table 32 

Classification Matrix for August 26, 1972, Based on MSS 
Bands 4,5, and 7 Using Equal Prior Probabilities 

Table 33 

Classification Matrix for September 13, 1972 Based on 
MSS Bands 5 and 7 Using Equal Prior Probabilities 

Table 34 

Classification Matrix for October 2, 1972 Based on MSS 
Bands 4, 5, 6, and 7 Using Equal Prior Probabilities 

Table 35 

Comparison of Multitemporal Classification Performance to 
Classification of Single Dates Using Equal Prior Probabilities 

Table 36 

Classification Matrix Using August 26, 1972, MSS Bands 4, 5, and 7 
With Subgroups 2 and 3 as Training Data and Subgroup 1 as Test Data 

Table 37 

Classification Matrix Using August 26, 1972 MSS Bands 4,5, and 7 
With Subgroups 1 and 3 as Training Data and Subgroup 2 as Test Data 

Table 38 

Classification Matrix Using August 26, 1972 MSS Bands 4,5, and 7 
With Subgroups 1 and 2 as Training Data and Subgroup 3 as Test Data 

Table 39 

Classification Matrix Combining Tables 36, 37, and 38 




Table 40 


Classification Matrix Using August 26, 1972, MSS Bands 4, 5, and 7 

Table 41 

Covariance Matrices and Mean Vectors for Frame 1060-16512 (September 21, 1972) 

Table 42 

Covariance Matrices and Mean Vectors for Frame 1061-16570 (September 22, 1972) 

Table 43 

Classification Matrix for September 21, 1972, MSS Bands 
4, 5, 6, and 7 Using Quadratic Discriminant Functions with Unequal 
Prior Probabilities in Kansas Test Site for Select Fields 

Table 44 

Classification Matrix for September 21, 1972 Imagery (MSS Bands 4, 5, 6, and 7) 
Using Quadratic Discriminant Functions With Unequal Prior 
Probabilities in Kansas Test Site 

Table 45 

Classification Matrix for September 22, 1972 Imagery (MSS Bands 4, 5, 6, and 7) 
Using Quadratic Discriminant Functions with Unequal Prior 
Probabilities in Kansas Test Site for Select Fields 

Table 46 

Classification Matrix for September 22, 1972 Imagery (MSS Bands 4, 5, 6, and 7) 
Using Unequal Prior Probabilities, Kansas, All Fields 


Table 47 


Classification Matrix for September 22, 1972 Imagery (MSS Bands 4, 5, 6, and 7) 
Using Quadratic Discriminant Functions with Equal Prior 
Probabilities in Kansas Test Site for Select Fields 


Table 48

Classification Matrix for September 22, 1972 Imagery, 4 Bands 
Using Equal Prior Probabilities, Kansas 





Table 49 


Classification Matrix of Select Fields in Frame 1060-16512 Classification, 
Using Statistics from Select Fields in Frame 1061-16570 

Table 50 

Classification Matrix of All Fields in Frame 1060-16512 Classification, 

Using Statistics Generated From Select Fields in Frame 1061-16570 

Table 51 

Classification Matrix of Select Fields in Frame 1061-16570 Classification, 
Using Statistics Generated from All Fields in Frame 1060-16512 

Table 52 

Classification Matrix for September 21, 1972 imagery (MSS bands 4, 5, 6, and 7) 
Using Unequal Prior Probabilities in South Dakota Test Site 

Table 53 

Classification Matrix for September 21, 1972 Imagery (MSS Bands 4, 5, 6, and 7) 
Using Quadratic Discriminant Functions With Unequal Prior Probabilities In South 

Dakota Test Site For Select Fields 

Table 54

Means and Covariance Matrices for Crops in South Dakota 
On Frame 1060-16491, September 21, 1972 

Table 55 

Preliminary Classification of Idaho Study Area Data Using 
August 1972 Data Bands 4, 5, and 7 and Unequal Prior Probabilities 

Table 56 

Preliminary Classification of Idaho Study Area Data Using 
August 1972 Data Bands, 4, 5, and 7 with Equal Prior Probabilities 

Table 57 

Classification Matrix of Idaho Study Area, August 1972 Imagery 
Using MSS Bands 4, 5, 6, and 7 with Unequal Prior Probabilities 




Table 58 


Classification Matrix of Idaho with Unequal Prior Probability 
Groups - Table 57 Collapsed into 7 Groups 

Table 59 

Classification of Flightlines 3 and 10 by Segment, Using Quadratic Discriminant
Functions on All Eight Spectral Variables, Kansas Aircraft Data, September 1972

Table 60 

Classification of Flightlines 3 and 10, on All Eight Spectral
Variables, Kansas Aircraft Data, September 1972 

Table 61 

Classification of Flightlines 5 and 6, by Segment, Using All 
Eight Spectral Variables, Idaho, September 1972 

Table 62 

Classification of Flightlines 5 and 6, Using Eight 
Spectral Variables, Idaho, September 1972 

Table 63 

Classification of Flightlines 2 and 5 by Segment, Using Eight 
Spectral Variables, South Dakota, September 1972 

Table 64 

Classification of Flightlines 2 and 5, by Segment, Using Eight 
Spectral Variables, South Dakota, September 1972 

Table 65 


Classification of Flightlines 2 and 8, by Segment, Using Eight
Spectral Variables, Missouri, September 1972 

Table 66 


Table of Correlation Coefficients Squared between the Items of Interest 

Table 67 

Acreage Estimates, Variances, Coefficients of Variation 
for Sample Sizes of 5 and 10 Using LANDSAT Data




Table 68 


Acreage Estimates, Variances, Coefficients of Variation 
for Sample Segments of Size 5 and 10 Without the Aid of LANDSAT Data 

Table 69 

Missouri 1972 JES Time and Mileage Data 
Table 70 

South Dakota 1972 JES Time and Mileage Data 


Table 71 

Kansas 1972 JES Time and Mileage Data 

Table 72

Idaho 1972 JES Time and Mileage Data 
Table 73 

Time and Mileage Data for Idaho, by Enumerator

Table 74 

Time and Mileage Data for Missouri, by Enumerator 

Table 75 

Time and Mileage Data for South Dakota, by Enumerator 

Table 76 

Time and Mileage Data for Kansas, by Enumerator 






TABLE OF FIGURES 


1. South Dakota Crop Reporting District 6,000 Showing Two Aircraft Flightlines.

2. Kansas Crop Reporting District 7,000 Showing Two Aircraft Flightlines. 

3. Missouri Crop Reporting District 9,000 Showing Two Aircraft Flightlines. 

4. Idaho Crop Reporting District Showing Two Aircraft Flightlines. 

5. A Typical Segment Divided into Tract and Fields. 

6. Ground Truth Record Form, Kansas Segment Number 3086, Tract 3, Third
Visit.

7. Sketch of Segment Showing Field Boundaries and Crop Classes. 

8. Conceptualized Mapping from Agricultural Fields into Measurement Space. 

9. Partitioned Measurement Space. 

10. Measurement Space Showing Two Crop Density Functions and an Unknown Point. 

11. Measurement Space Where Crop Types Have Same Covariance Matrix and Slope. 

12. Measurement Space When Crops Have Different Covariance Matrices. 

13. Measurement Space Showing an Outlier and Three Crop Areas with 95% Confidence
Limits.

14. Gray Scale Printout for a Segment Showing How Fields are Defined. 

15. Comparison of Overall Percent Classification by States, 1972. 

16. Comparison of Classification Methods by Crop, Kansas, August 18, 1972. 

17. Comparison of Classification Methods by Crop, Missouri, August 29, 1972. 

18. Comparison of Classification Methods by Crop, South Dakota, September 23,
1972.

19. Comparison of Classification Methods by Crop, Idaho, August 12, 1972. 

20. Stepwise Discriminant Analysis, Classification into Nine Groups, Density
and Transmission Mode, South Dakota, 1972. 

21. Stepwise Discriminant Analysis Classification into Nine Groups, Trans- 
mission Scanning Mode, South Dakota, 1972. 




22. Stepwise Discriminant Analysis, Classification into Nine Groups, Density 
Scanning Mode, South Dakota, 1972.

23. Stepwise Discriminant Analysis, Classification into Eight Groups, All 
Variables, South Dakota, 1972. 

24. Stepwise Discriminant Analysis, Classification into Eight Groups,
Transmission Mode, South Dakota, 1972.

25. Stepwise Discriminant Analysis, Classification into Eight Groups, Density 
Mode, South Dakota, 1972. 

26. Stepwise Discriminant Analysis, Classification into Six Groups, Density 
and Transmission Scanning Mode, South Dakota, 1972. 

27. Stepwise Discriminant Analysis, Classification into Six Groups, Transmission 
Scanning Mode, South Dakota, 1972. 

28. Stepwise Discriminant Analysis, Classification into Six Groups, Density 
Scanning Mode, South Dakota, 1972. 

29. Stepwise Discriminant Analysis, Classification into Four Groups, Density 
and Transmission Scanning Mode, South Dakota, 1972. 

30. Stepwise Discriminant Analysis, Classification into Four Groups, Trans- 
mission Scanning Mode, South Dakota, 1972. 

31. Stepwise Discriminant Analysis, Classification into Four Groups, Density 
Scanning Mode, South Dakota, 1972. 

32. Overall Classification Accuracy by Number of Groups and Measurement Mode 
for Four Variables. 


LIST OF CONTRIBUTORS


Name                   Status                            Time

Donald H. Von Steen    Principal Investigator            1972-1975
Harold F. Huddleston   Principal Research Statistician   1972-1975
Paul D. Hopkins        Statistician & Data Processing    1972-1975
Fred B. Warren         Statistician & Data Processing    1972-1975
Edward Camara          Statistician                      1972-1975
William H. Wigton      Statistician                      1972-1975
Ronald Bosecker        Statistician                      1972-1973
John E. Ridgely        Statistician                      1972-1972
Ronald J. Steele       Statistician                      1972-1974
Chapman P. Gleason     Statistician                      1974-1975
Paul W. Cook           Statistician                      1973-1975
Edward E. Burgess      Statistician                      1973-1974
Clare Fisk             Writer-Editor                     1975-1976
Victoria M. Posey      Typist                            1972-1976




I. Introduction 


The Statistical Reporting Service (SRS), U.S. Department of Agriculture, 
prepares estimates of crops, livestock, poultry, dairy, prices, and 
related agricultural topics. 

Crop reports provide estimates of acreages farmers intend to plant in the
coming season, the acres planted and harvested, production, disposition
of the crop, and remaining stocks. Forecasts of yield and production are
issued monthly during the growing season based on information voluntarily
provided by farmers and from counts, measurements, and observations made
in sample fields by SRS enumerators.

Livestock and poultry reports include estimates of animals on farms and 
ranches or in feedlots. Estimates are made of breeding and production 
intentions; yearend estimates cover production and disposition of major 
livestock and poultry species. SRS also reports slaughter numbers and 
meat production. 

Dairy reports indicate milk cows, monthly and annual milk production, and
use of milk. Production of major manufactured dairy products is reported
weekly and monthly.

Price reports show prices received by farmers for nearly 200 products and 
prices paid for about 500 items needed for production or family living. 
Reports cover indexes of prices received and paid, parity prices, and 
season average prices of crops, livestock, and livestock products. 

Other reports deal with labor and wages, fertilizer, seeds, bees and honey, 
mink, naval stores, stocks of major commodities, cold storage holdings, 
exports and other agricultural elements. 

The scope of agricultural estimates has increased with the demands for
information by producers, processors, manufacturers, and Government program
planners, but the original goal has remained steady - to help farmers
market farm products more effectively.

The launching of ERTS-1 (now LANDSAT) on July 23, 1972 opened a new potential
source of agricultural data. This investigation has provided SRS
with an opportunity to evaluate a different source of data relative to
crop acreage estimates. In addition, it presented the opportunity
to determine whether the theory of sampling is flexible enough to utilize
satellite data efficiently in conjunction with other survey procedures.

If it were possible to blend these sources, a substantial increase in
survey accuracy would ensue.


1/ Preparing Crop and Livestock Estimates. Statistical Reporting Service,
March 1974.




The objectives of this investigation were as follows: 

1. Develop methods to identify crop species utilizing satellite and
aircraft imagery.

2. Develop methods of estimating crop acreages utilizing satellite
imagery.

3. Within the context of multi-stage and multiple frame sampling,
develop methods of utilizing all three sources of data (ground,
aircraft, and satellite) to make crop acreage estimates. Combining
all three sources in a statistical model should result in
a marked improvement over any one source for making crop acreage
estimates.

The study areas were selected Crop Reporting Districts in Missouri, Kansas,
South Dakota, and Idaho. The major crops of concern were wheat, corn,
cotton, soybeans, sugar beets, potatoes, alfalfa, and grain sorghum. Some
of the crops are grown in only one area while others are common to two or
three. This provided the opportunity to observe crops grown under different
conditions.


II. Data Acquisition

2.1 Ground Observations 

In order to evaluate the new methodology, one needs independently collected
(control) data. For this study, ground truth collected in the same
manner as is now being used by the Statistical Reporting Service (SRS)
was used as the control data for evaluating results from both the satellite
and aircraft imagery.

The thrust of the ground truth portion of the LANDSAT project is to identify
the crops visible from the air on previously designated areas of land.
Our ground truth identifies the crop species present and the exact location
of the fields for the survey.

Throughout the growing season, the species, acreage, and condition of
crops in these fields are observed periodically. This provides progressive
reports about crop maturity development and a record of any changes in
acreage or species. This data provides survey acreage for crops which
could be compared against other sources of data and corresponding estimates.




The condition of the crop in each field is noted as supplementary information.
During the processing of aerial photography and satellite imagery,
the condition code would, in some cases, help explain why a corn
field was classified incorrectly.

The first enumerative survey was conducted in late May and early June of
1972 by SRS. This data was used as a source of original data and was
then updated by special enumerators. 1/ However, the estimates of crop
acreages generated by the June Enumerative Survey (JES) included both crops
already planted and crops to be planted. At the time of the enumerative survey, the
wheat in Missouri might still be in the field and was recorded as such on
the questionnaire. In addition, the farmer's intention to plant soybeans
was recorded for that same field. The LANDSAT ground truth was only concerned
with crops and ground vegetation present on the day the enumerator
visited the segments. For this reason, the JES
acreage estimates could be different from the LANDSAT acreage estimates; however,
provisions were made through the updating of JES so such differences
could be measured.

The LANDSAT ground truth was also used as a training device to classify
aerial photography and satellite imagery. Since the exact location of
each field and the crop species present in the field were known, we could
identify the field on the aerial photography or satellite imagery and
train the computer to recognize and identify all similar fields. After
identification, a separate estimate of acreage can be generated from these
other sources of data and compared against the ground truth acreage estimates.
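The training step described above can be sketched in a few lines. The example below assumes a quadratic discriminant classifier of the kind named in this report's table titles, with prior probabilities taken from the training proportions. The band values, crop labels, and function names are hypothetical, NumPy is assumed to be available, and this is an illustration rather than the LARSYS implementation.

```python
import numpy as np

def train(pixels_by_crop):
    """Estimate a mean vector, covariance matrix, and prior for each crop."""
    total = sum(len(p) for p in pixels_by_crop.values())
    stats = {}
    for crop, pixels in pixels_by_crop.items():
        x = np.asarray(pixels, dtype=float)
        stats[crop] = {
            "mean": x.mean(axis=0),
            "cov": np.cov(x, rowvar=False),
            "prior": len(x) / total,  # unequal priors from training proportions
        }
    return stats

def classify(pixel, stats, equal_priors=False):
    """Assign the crop with the largest quadratic discriminant score."""
    x = np.asarray(pixel, dtype=float)
    best_crop, best_score = None, -np.inf
    for crop, s in stats.items():
        d = x - s["mean"]
        _, logdet = np.linalg.slogdet(s["cov"])
        # Log of a multivariate normal density, dropping the shared constant.
        score = -0.5 * logdet - 0.5 * (d @ np.linalg.solve(s["cov"], d))
        if not equal_priors:
            score += np.log(s["prior"])
        if score > best_score:
            best_crop, best_score = crop, score
    return best_crop

# Hypothetical two-band training pixels taken from fields of known crop.
training = {
    "corn": [[30, 60], [32, 63], [29, 58], [31, 61], [33, 62]],
    "soybeans": [[45, 40], [47, 42], [44, 39], [46, 43], [48, 41]],
}
stats = train(training)
print(classify([31, 60], stats))
print(classify([46, 41], stats))
```

Passing `equal_priors=True` drops the prior term, which corresponds to the equal-prior case that some of the classification matrices in this report are based on.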

2.1.2 Source of Ground Data 

The test areas used in this study were SRS Crop Reporting Districts (CRD).

A CRD is a contiguous group of counties within a state which have similar
farming activities. Generally, each state is composed of about nine such
districts.

Within each of these CRD's are randomly selected areas of land (segments)
that range in size from about one-half square mile to three square miles.
Since the CRD's are independent strata, estimates can be made for each
individual stratum by multiplying the segment totals by the reciprocal of
their probability of selection and summing over the CRD. For the JES and
the LANDSAT study, these segments are the test sites for the classification
of the aircraft and satellite imagery. The information obtained from
these segments on crops present constituted our ground observations.
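The expansion just described, in which each segment total is multiplied by the reciprocal of its selection probability and the products are summed over the CRD, can be written out as a short sketch. The acreages and the selection probability of 1/50 below are hypothetical values chosen for illustration, and the function name is an assumption.

```python
def expanded_total(segments):
    """Direct-expansion estimate for one stratum (CRD).

    `segments` is a list of (segment_total, selection_probability) pairs.
    Each total is weighted by the reciprocal of its selection probability.
    """
    return sum(total / prob for total, prob in segments)

# Hypothetical corn acres observed in four sampled segments of one CRD,
# each segment selected with probability 1/50.
corn_segments = [(120.0, 0.02), (95.5, 0.02), (0.0, 0.02), (210.0, 0.02)]
estimate = expanded_total(corn_segments)
print(round(estimate, 1))
```

Because the CRD's are independent strata, a state-level figure under the same assumptions would simply be the sum of the per-CRD estimates.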


1/ See Appendix B for a list of terms and definitions used for the June
Enumerative Survey (JES) and LANDSAT fieldwork.




Ground data was collected for segments in CRD six in South Dakota, seven
in Kansas, and nine in Missouri. In Idaho, the study area was not a CRD,
but a land use stratum which included the intensive agriculture areas of
Jerome, Minidoka, Twin Falls, and Cassia Counties. The study areas
within each state were selected since they represented an area with a
manageable volume of data and a comparable number of segments.


Table 1 — States and numbers of segments in study area. 


State           Number of Segments

South Dakota            50
Kansas                  48
Missouri                42
Idaho                   44

TOTAL                  184


The four different test sites (see Figures 1, 2, 3, and 4) were selected
to fulfill operational objectives. First, we wanted to monitor the progressive
stages of growth and maturity of the major crop species. The
original satellite launching date would have allowed monitoring crop
growth from April through November of 1972. Mature wheat in Kansas could
be compared to pre-headed and headed wheat in South Dakota, with similar
comparisons being made for other crops. Secondly, we wanted the scattered
areas to help insure at least some good imagery. Imagery of cloud cover
over selected areas is useless. Presumably, the distant areas would not
all be engulfed with inclement weather as the aerial photography and
satellite imagery were obtained. Thirdly, we wanted to answer whether or
not corn in Missouri was spectrally different from corn in South Dakota,
etc. Fourthly, we wanted to look at several different crops and their
responses to different locational environments of soil, topography, and
climate. The four-State analysis gives an indication of the within-State and
between-State variations that must be known prior to operational surveys of
this type. The major crops included in the study are shown in Table 2.


Figures 1-4. Crop Reporting District maps for South Dakota, Kansas, Missouri,
and Idaho, each showing two aircraft flightlines (maps not legible in this
reproduction).

Figure 5. A Typical Segment Divided into Tract and Fields (figure not legible
in this reproduction).

Visit one (base) data was obtained directly from the JES questionnaire,
which was completed in late May 1972 and/or early June 1972. 1/ The JES
data was identified and keypunched for all "fields" in the segments. The
identification of each field was required in order to delete crops which
might have been reported as fields to be planted at a later date. For
example, a 20-acre field could be recorded both as wheat and as
soybeans to be planted after the wheat was harvested. Since aerial photography
and satellite imagery would record only crops present, the ground
observation could only correspond to what was in the field at the time
of visit, and only the wheat would be punched. If the wheat field was
later planted to soybeans, this change was made during the update.
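The rule for building the base data, keeping only crops actually present at the time of the visit and dropping crops reported as still to be planted, can be illustrated with a small sketch. The record layout, the `status` values, and the function name are hypothetical and are not taken from the JES keypunching instructions.

```python
# Hypothetical JES-style records: field 1 is reported twice, once for the
# wheat standing at the visit and once for soybeans intended to follow it.
records = [
    {"field": 1, "crop": "wheat", "acres": 20.0, "status": "present"},
    {"field": 1, "crop": "soybeans", "acres": 20.0, "status": "to_be_planted"},
    {"field": 2, "crop": "corn", "acres": 35.0, "status": "present"},
]

def crops_present(records):
    """Keep only the records for crops in the field on the day of the visit."""
    return [r for r in records if r["status"] == "present"]

base = crops_present(records)
for r in base:
    print(r["field"], r["crop"], r["acres"])
```

Under this rule each field contributes its acreage once, so the 20-acre field is counted as wheat only until a later update records the change to soybeans.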

After completion of all the ground observations, the four State Statisti- 
cal Offices involved in the LANDSAT study were sent an evaluation form to 
evaluate the computer printout recording form. From the answers to the 
evaluation, the following can be said: 

1. The Form Printout is a workable method of collecting ground observa- 
tions. There might have been a small problem orienting the enumera- 
tors to a different form than the accustomed one. However, with 
training, the transition was short. The enumerators were able to 
record acres and crop species without difficulty. 

2. The crop condition codes were generally adequate, but several sug- 
gestions were made. The suggestions were a) call this "State of 
Growth" rather than "Condition," b) change the grain codes from "pre- 
fruit" to "blade" and "fruit" to "heading," c) remove pasture from 
the hays and code the pastures as lush, grazed, and range, and finally 
d) add the code weedy to fallow. 

3. The new printout format did not create unusual editing or keypunching
situations.

2.1.4 Average Field Size 

Classification results from LANDSAT imagery indicate that field size may
have a significant effect on classification accuracy. Also,
early reported results by other investigators suggested that relatively 
poor classification was obtained from fields less than 20 acres. Several 
inquiries to the Statistical Reporting Service for information on size of 
field prompted the preparation of a detailed tabulation of fields by size 


1 / 

See Appendix B for a copy of a JES questionnaire and the keypunching 
instructions for the LANDSAT survey. 




and by crop (see Tables 3, 4, 5, and 6). The data for this tabulation
is from the 1972 SRS JES in the four test sites. It should be pointed
out that this information represents only the four test areas.

In the Missouri test site, 28.7 percent of the fields are 20 acres or 
greater and account for 68 percent of the land area. Thirty-eight per- 
cent of the cotton fields are greater than 20 acres, but account for 73 
percent of the reported cotton acreage. Forty-one percent of the soybean
fields were 20 acres or larger and represent 77.5 percent of the soybean
acreage. The average size of all fields in Missouri was 17.11 acres.

South Dakota reported that 92 percent of the corn acreage and 89 percent 
of the oats were in fields larger than 20 acres. Overall, 52 percent of 
the reported fields were greater than 20 acres with an average field size 
of 28.74 acres. The average field size needs to be viewed with some 
caution in that it can be heavily influenced by large or small acreages 
for relatively unimportant land uses such as pasture, farmstead, etc. 

In Kansas, 98.5 percent, 99.1 percent, 98.5 percent, and 95.6 percent of
the corn, wheat, sorghum, and alfalfa acreage, respectively, were grown in
fields larger than 20 acres. Field size should not be a limiting factor
in identifying these crops in Kansas. Average size of all fields in 
Kansas was 108.31 acres. 

The test area in Idaho contained some large areas of waste and pasture
which influenced the average field size and the distribution. About 50
percent of the corn was planted in fields larger than 20 acres. Eighty-
five percent of the barley was in 20 acre plus fields. Ninety-four per-
cent of the potatoes were contained in fields larger than 20 acres. About
65 percent of the sugar beets were grown in 20 acre plus fields.

If field size is a factor in one's ability to do crop classification, the 
results in Kansas should be substantially better than in the other three 
states. Field shape may be a greater limiting factor than size, parti-
cularly in areas which contain irregular fields.

2.1.5 Timing and Workload of Fieldwork

Because of the delay in the launch of LANDSAT-1, the update surveys did not 
begin until August 1972. Prior to the first visits, a training school 
was conducted in each State involved. The training was to 1) instruct 
State Statistical Offices (SSO) personnel regarding enumeration, editing, 
keypunching, and mailing procedures, and 2) instruct enumerators regard- 
ing the collection of ground observations. 1/ 


1 / 

See Appendix A for Enumerator Instructions, for Ground Observation 
Editing Instructions, and B for Ground Observation SSO Keypunching Instructions. 


























































The timetable for the collection of the ground observations was:

August 3          Enumerator training schools
August 7-11       Survey fieldwork
August 11         Enumerators mail update survey forms to SSO's
August 14-17      SSO's edit forms and keypunch data
August 17         SSO's mail forms and data cards to Washington, D.C.
August 24         Form printout run for next survey fieldwork
August 25         Printout sent to SSO's
September 9       Enumerators receive printout
September 11-15   Survey fieldwork
September 15      Enumerators mail forms to SSO's
September 18-21   SSO's edit and keypunch updates
September 21      SSO's mail forms and data cards to Washington, D.C.
September 27      Printout run for next survey fieldwork
September 28      Printout sent to SSO's
October 7         Enumerators receive printout
October 13        Enumerators mail forms to SSO's
October 16-19     SSO's edit and keypunch updates
October 19        SSO's mail forms and data cards to Washington, D.C.
October 27        Final printout run


Although the data was not summarized monthly, it would have been possible
to do so once the summarization program had been implemented. With that
program in place, the data could have been summarized within 14 days of
the completion of fieldwork.

For each survey period, enumerators observed about 3,800 fields and 
recorded the data on about 1,100 forms. Because of this volume, the 
computer generated survey form was a necessity. The numbers of segments, 
tracts, and fields observed on each update survey are shown in Table 7. 


Table 7 — Number of segments, tracts, and fields by test site.

State            Number of    Number of    Number of
                 Segments     Tracts       Fields

South Dakota        50           217          860
Kansas              48           274          854
Missouri            52           284          872
Idaho               44           311         1358

TOTAL              194          1086         3844




2.1.6 Summarization of Ground Observations 


Since these segments were selected at random within a CRD, an expansion
is possible to estimate totals for the CRD. The following estimator
could be used:

$$\hat{Y}_j = \sum_{i=1}^{n} F \, y_{ij}$$

where $F = N/n$ (the inverse of the probability of selection), $N$ = total
number of sampling units in the test site, $n$ = the number of sampling
units in the sample, and $y_{ij}$ is the acreage of the jth crop in the ith
sampling unit.

The standard error of $\hat{Y}_j$ is $Se(\hat{Y}_j)$, where:

$$Se(\hat{Y}_j) = \sqrt{\frac{n \sum_{i=1}^{n} (F y_{ij})^2 - \left( \sum_{i=1}^{n} F y_{ij} \right)^2}{n - 1}}$$

The coefficient of variation is:

$$\text{C.V.} = \frac{Se(\hat{Y}_j)}{\hat{Y}_j} \times 100$$

The update observations were summarized in the same manner as the JES.
Estimates of the acreages, standard errors, and coefficients of variation
by crop and date are included in Tables 8, 9, 10, and 11. The coefficients
of variation, which are measures of the relative precision of the estimates,
ranged from about 10 percent to 100 percent, depending upon the particu-
lar crop or land use being estimated. For most major crops the C.V.'s
were around 16 to 30 percent. On the other hand, crops which are not very
important to that area and which were found in only one field in the
selected JES segments had C.V.'s of around 100 percent.

2.1.7 Flightline Ground Observations

Flightline Selection: Each of the four study areas was divided into
flightlines such that all flightlines in a single study area were of the
same width. The width of the flightlines was limited to the swath width
of the RB-57 and U-2 aircraft photo coverage and varied from 8 to 12
miles, depending on the area. Two flightlines in each study area were
then selected at random, without replacement. The approximate locations
of the selected flightlines are shown in Figures 1, 2, 3, and 4.








Table 9 — Estimated acres, standard errors, and coefficient of variation by crop and date, KANSAS, 1972. 







Estimated acres, standard errors, and coefficient of variation by crop and date, SOUTH DAKOTA, 1972 



















Each flightline contained a number of sampling units (JES segments).
Even though the segments already existed before the flightlines were
constructed, their appearance in the sample was still random. The number
of segments within each flightline varied by flightline and State (Table
12). Once it was determined which segments fell within the flightline, a
count of all other possible segments in the flightline was made; thus
the probability of a segment being selected is $m_i / M_i$ within the ith
flightline.

This is a multistage sample design where the selection of flightlines is 
the first stage and the second stage of selection is the segments. 

Whereas in this particular case, maps were used as the frame to select 
the sample, it might have been possible to select a similar sample using 
ERTS imagery or aerial photography. 

Table 12 — Number of segments within flightline by flightline and by state.

State           Flightline    Number of     Added for classifier
                              segments      training

Missouri            1             2               5
                    2             8               6

Kansas              1             5               2
                    2             9               3

South Dakota        1             4               6
                    2             5               4

Idaho               1             6               6
                    2             9              10


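Flightline estimates for the study areas are explained below.

The estimates of totals for a two-stage sample design are as follows:

$$\hat{Y}_k = \frac{N}{n} \sum_{i=1}^{n} \frac{M_i}{m_i} \sum_{j=1}^{m_i} y_{ijk}$$

where: $\hat{Y}_k$ is the estimate of the total acreage of the kth crop or
       characteristic within a study area,

       $N$ is the total number of flightlines,

       $n$ is the number of flightlines in the sample,

       $M_i$ is the total number of segments within the ith selected
       flightline,

       $m_i$ is the number of selected segments within the ith selected
       flightline,

       $y_{ijk}$ is the acreage of the kth crop in the jth selected segment
       of the ith selected flightline.

The variance of $\hat{Y}_k$ is:

$$\text{var}(\hat{Y}_k) = N^2 \left( \frac{N - n}{N} \right) \frac{S_{k1}^2}{n} + \frac{N}{n} \sum_{i=1}^{n} M_i^2 \left( \frac{M_i - m_i}{M_i} \right) \frac{S_{k2i}^2}{m_i}$$

where:

$$S_{k1}^2 = \sum_{i=1}^{n} \frac{(\hat{Y}_{ki} - \bar{Y}_k)^2}{n - 1}, \qquad \hat{Y}_{ki} = \frac{M_i}{m_i} \sum_{j=1}^{m_i} y_{ijk}, \qquad \bar{Y}_k = \frac{1}{n} \sum_{i=1}^{n} \hat{Y}_{ki}$$

$$S_{k2i}^2 = \sum_{j=1}^{m_i} \frac{(y_{ijk} - \bar{y}_{ik})^2}{m_i - 1}, \qquad \bar{y}_{ik} = \frac{1}{m_i} \sum_{j=1}^{m_i} y_{ijk}$$

and

$$\text{C.V.} = \frac{\sqrt{\text{var}(\hat{Y}_k)}}{\hat{Y}_k} \, (100)$$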
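The estimators described in Sections 2.1.6 and 2.1.7 can be sketched in a few lines of Python. This is a minimal illustration, not the program used in the study: the function names, argument layout, and sample numbers are hypothetical, and the finite population correction terms follow the standard textbook form for two-stage simple random sampling, which is assumed here to match the report's variance formula.

```python
from math import sqrt
from statistics import variance  # sample variance, divisor (count - 1)


def direct_expansion(y, N):
    """Single-stage estimate (Section 2.1.6).

    y: acreages of one crop in the n sampled units (hypothetical input layout)
    N: total number of sampling units in the test site
    """
    n = len(y)
    F = N / n                      # inverse of the selection probability
    expanded = [F * v for v in y]
    total = sum(expanded)
    se = sqrt((n * sum(e * e for e in expanded) - total ** 2) / (n - 1))
    return total, se, 100 * se / total


def two_stage_total(samples, M, N):
    """Two-stage estimate (flightlines, then segments).

    samples[i]: acreages in the sampled segments of flightline i (needs >= 2 each)
    M[i]:       total number of segments in flightline i
    N:          total number of flightlines in the study area
    """
    n = len(samples)
    # Estimated total for each sampled flightline
    line_totals = [M_i / len(y_i) * sum(y_i) for y_i, M_i in zip(samples, M)]
    total = N / n * sum(line_totals)
    # Between-flightline and within-flightline variance components
    between = N ** 2 * ((N - n) / N) * variance(line_totals) / n
    within = (N / n) * sum(
        M_i ** 2 * ((M_i - len(y_i)) / M_i) * variance(y_i) / len(y_i)
        for y_i, M_i in zip(samples, M)
    )
    se = sqrt(between + within)
    return total, between, within, se, 100 * se / total
```

With only two flightlines per study area, the between-flightline term rests on a single degree of freedom, which is one way to see why the flightline estimates are unstable.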


2.1.8 Results of Flightline Ground Observations 

As would be expected from a sample of size 2 from the heterogeneous study
areas, the flightline estimates in all four States were not very reliable.
Coefficients of variation, the measures of precision of the estimates,
ranged from 20 to 100 percent. For most crops, the between
flightline component of variance was the largest contributor to the total
estimated variance. Therefore, if the computed variance components are
any indication, the easiest way to reduce the variance would be to add
more flightlines.
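The relationship among the tabulated quantities can be checked with a short calculation. The sketch below assumes that the standard error is the square root of the sum of the between and within components, a relation that is inferred from the published figures rather than stated in the text; the function name is hypothetical, and the input values are the Kansas alfalfa row of Table 16.

```python
from math import sqrt


def summarize_components(estimate, between, within):
    """Combine variance components into a standard error and C.V.

    Assumes total variance = between + within (inferred, not stated in the report).
    """
    total_variance = between + within
    se = sqrt(total_variance)
    return {
        "standard_error": se,
        "cv_percent": 100 * se / estimate,
        "between_share": between / total_variance,
    }


# Kansas alfalfa, Table 16: estimate, between and within flightline variance
alfalfa = summarize_components(152_390, 18_578_718_288, 2_437_398_690)
```

For this row the between-flightline component accounts for roughly 88 percent of the total variance, which is consistent with the recommendation to add flightlines rather than segments.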





In several cases the CRD estimates from the flightline ground observa-
tions compare favorably with those from the JES, but the size of the
standard error would indicate that this is due to chance. Flightline
estimates of total acres by crop, the estimated between and within 
flightline components of variance, and standard errors, and coefficients 
of variations of the estimated totals are shown by study areas in Tables 
13-18. Generally, these computations were made only for the crops which 
were classified from the LANDSAT and aircraft imagery. However, some of 
the crops shown were not included in the LANDSAT or aircraft classifica- 
tion. 

For Missouri, all three of the update ground surveys were tabulated for 
the selected flightlines. This was to correspond with the occurrence of 
useable LANDSAT imagery from each of the months, August, September, and 
October. The only significant changes in estimated totals occurred 
between the September 11-15 and October 10-13 ground surveys. These 
changes occurred as cotton and soybeans were harvested, causing the 
use to change from those crops to idle (stubble) land or fallow (plowed) . 
The only flightline totals shown for the other three study areas are for 
the update survey periods August 7-10 (Idaho and South Dakota) and Sep- 
tember 11-15 (Kansas). 

Except for Kansas, the between flightline variance component was based 
on all flightlines in the study area in order to get a reasonable esti- 
mate of this variance. 

The main conclusion to be drawn from the flightline ground observation
analysis is that in order to get reliable estimates from this multi-stage
sampling approach, more flightlines are needed but it is not necessary
that they cover such a wide swath. Also, in constructing flightlines,
the total size (length times width) of the flightlines should be kept as
equal as possible. For example, flightline 2 in Missouri is much smaller
than flightline 8. This variation in size can contribute significantly
to the variance, and so reduce the overall precision, of the estimates.

2.2 Data Acquisition - LANDSAT Imagery 
2.2.1 Objectives

Satellite imagery required for this project included both the computer 
compatible MSS digital data tapes and various types of photographic 
images. 

The photographic images were required to: 

1. determine if a particular LANDSAT frame was usable, and

2. assist in locating individual test sites (segments) in the frame.


Table 13 — Estimated totals, between and within flightline components of
           variance, standard errors, and coefficients of variation of the
           estimated totals by crops, Missouri Study Area, August 7-10, 1972

                   Estimated        Between              Within                          Coefficient
                   Acres            Flightline           Flightline        Standard      of
Crop               for CRD          Variance             Variance          Error         Variation

Cotton               309,096        79,585,100,901      1,760,246,861       285,211        92.3
Corn                  82,602         2,676,330,320        253,508,210        54,127        65.5
Fruit                 33,390           929,039,587        185,838,406        33,390       100.0
Soybeans           1,533,204     1,174,865,916,260     52,150,144,244     1,107,707        72.2
Grain Sorghum         20,790            51,070,314         22,424,259         8,573        41.2
Other Hay            199,800        33,265,369,332      1,339,221,600       186,023
Clover                30,240           762,017,518        152,409,600        30,240       100.0
Farmstead, etc.      517,716        80,235,265,331      6,671,987,712       294,800
Pasture              276,606        29,869,978,560      2,352,207,811       179,581        64.9 *
Fallow                46,746         1,820,917,590         83,905,414        43,644
Idle                 249,030        33,103,771,942      1,234,579,858       185,306

* The between flightline variance is based on all flightlines.







Table 14 — Estimated totals, between and within flightline components of
           variance, standard errors, and coefficients of variation of the
           estimated totals by crops, Missouri Study Area, September 11-15

                   Estimated        Between              Within                          Coefficient
                   Acres            Flightline           Flightline        Standard      of
Crop               for CRD          Variance             Variance          Error         Variation

Cotton               309,096        79,585,100,901      1,760,246,861       285,211
Corn                  82,602         2,676,303,320        253,508,210        54,127
Fruit                 33,390           929,039,587        185,838,406        33,390
Soybean            1,533,204     1,174,865,916,260     42,150,144,224     1,107,707
Grain Sorghum         20,790            51,070,314         22,424,259         8,573
Other Hay            178,200        26,452,114,920      1,471,219,200       167,103
Clover                30,240           762,017,518        152,409,600        30,240
Farmstead, etc.      517,716        80,235,265,331      6,671,987,712       294,800
Pasture              276,606        29,896,978,560      2,352,207,811       179,581
Fallow                68,346           526,723,656         87,015,814        24,774
Idle                 249,030        33,103,771,942      1,234,579,858       185,306

* The between flightline variance is based on all flightlines.









* The between flightline variance is based on all flight lines 


56.9 

64.9 * 
36.2 
80.7 










Table 16 — Estimated totals, between and within flightline components of
           variance, standard errors, and coefficients of variation of the
           estimated totals by crops, Kansas Study Area, September 11-15, 1972

                   Estimated        Between              Within                          Coefficient
                   Acres            Flightline           Flightline        Standard      of
Crop               for CRD          Variance             Variance          Error         Variation

Alfalfa              152,390        18,578,718,288      2,437,398,690       144,969        95.1
Pasture            3,208,195        88,735,891,538    223,622,366,804     1,054,031        34.9
Corn               1,146,690        15,166,828,880      3,634,212,685       137,117        99.6
Grain Sorghum      1,146,070       129,998,137,680    182,470,351,826       558,989        48.8
Winter Wheat          29,045           674,889,620        168,722,676        29,045       100.0
Fallow             1,086,780       215,995,641,680     30,751,534,531       495,938        45.7
Sugar Beets           46,255         1,711,620,020         85,585,359        42,393        91.7














Table 17 — Estimated totals, between and within flightline components of
           variance, standard errors, and coefficients of variation of the
           estimated totals by crops, Idaho Study Area, August 7-10, 1972.

                   Estimated        Between              Within                          Coefficient
                   four co.         Flightline           Flightline        Standard      of
Crop               acres            Variance             Variance          Error         Variation

Corn                 106,909           359,570,842        489,692,090        29,142        27.3
Barley                77,572         4,533,887,611        276,569,321        69,501        89.6
Winter Wheat          39,754         1,165,824,646        109,949,320        35,718        89.9
Mixed Grain           31,713           407,075,135         65,268,794        21,733        68.5
Spring Wheat          30,090           488,902,450         83,624,281        23,928        79.5
Potatoes             109,054         1,499,274,753        465,840,844        44,326        40.6 *
Field Beans           57,071         2,023,502,496        334,116,679        48,555        85.1
Alfalfa              203,120         5,369,614,922      4,002,751,528        96,811        47.7
Sugar Beets           91,019           403,669,920        511,370,942        30,249        33.2
Farmstead, etc.      139,227         1,789,834,480        284,669,027        45,547        32.7 *
Pasture              827,398       368,272,987,764     15,481,398,570       619,479        74.9
Fallow               192,285        13,256,360,708        618,122,640       117,790        61.3 *
Idle                 133,502         3,211,847,210        652,799,570        62,166        46.6

* The between flightline variance is based on all flightlines.




Table 18 — Estimated totals, between and within flightline components of
           variance, standard errors, and coefficients of variation of the
           estimated totals by crops, South Dakota Study Area, August 7-10

                   Estimated        Between              Within                          Coefficient
                   Acres            Flightline           Flightline        Standard      of
Crop               for CRD          Variance             Variance          Error         Variation

Corn                 908,350        51,266,009,138      3,367,476,285       233,738        25.7 *
Flax                  35,335         1,056,185,780        264,116,571        36,335       100.0
Fallow               111,055         1,568,220,500        616,904,097        46,745        42.1
Pasture              759,550        48,067,051,520      5,193,126,612       230,782        30.4
Sudex                 61,510         1,971,303,680        310,693,399        47,770        77.7
Alfalfa              272,720         4,805,000,000        919,799,185        75,662        27.7
Idle                 788,525        25,315,726,472      2,401,924,450       169,463        21.5 *

* The between flightline variance is based on all flightlines.

















The computer compatible data tapes were used: 

1. to generate the grey-scale computer printouts needed in locat- 
ing the individual segments (and fields) within the LANDSAT 
frame, and 

2. as data input into the computer crop classification routines. 

2.2.2 Approach 

Photographic imagery obtained from NASA included 70mm positive and nega-
tive transparencies and system corrected 9.5" positive B&W transparencies
for all LANDSAT frames which (1) included any part of one of the four sites
and (2) had less than 50 percent cloud cover. Precision
9.5" color composite photographs were also ordered, but not analyzed.

Enlargements (1/250,000) of the composite photographs for selected frames 
were obtained from the ASCS photo lab in Salt Lake City. 

System corrected MSS digital tapes were also obtained for all frames 
having less than 50 percent cloud cover. 


2.2.3 Evaluation 

We received LANDSAT 70mm transparencies and the system corrected digital
data tapes as a standing order. The first digital data tapes were
received November 1, 1972. Tapes received between November 1 and Novem-
ber 16 included scenes taken as early as August 15. After November 16,
tapes generally were received about four weeks after the scene was taken.

The initial delay in receiving data tapes was serious only in that
various computer programs could not be tested operationally until at least
one set of tapes had been received.

In retrospect, a more desirable procedure would have been to place a 
standing order for either 9.5 inch or 70mm transparencies of all LANDSAT 
frames which covered any part of a target area. Then, a selection of 
data tapes to be ordered could have been made from these transparencies. 

This would have effected a substantial reduction in the number of data
tapes received and stored, but essentially unused because of incomplete
cloud-free coverage over a given site during a particular cycle.

The 1/250,000 scale color enlargements of selected LANDSAT frames were
used to visually locate specific training sites in the LANDSAT frame.




2.3 Data Acquisition - Aerial Photography 

2.3.1 Objectives 

High altitude photography was acquired from NASA and the South Dakota
Remote Sensing Institute (SDRSI) to meet the following objectives:

1. Develop methods of crop species identification from aerial
photography by computer classification techniques, and compare
the results with the ground data and with the results obtained
using LANDSAT imagery.

2. Estimate crop acreages by expansion of classification results
to the flightline level and crop reporting district level.

3. Assist in the location of segments on the LANDSAT frame or
printouts.

2.3.2 Approach 
Flightline Selection 

Adjacent, non-overlapping flightlines were drawn on aeronautical charts
to provide complete coverage of the land area within each of the four
LANDSAT test site areas for this project. The flightlines constructed
were 8-10 miles wide and sufficiently long to traverse the full length of 
the test site. Within each LANDSAT test site, two flightlines were ran- 
domly selected for aerial photography overflights. NASA provided high 
altitude, color positive, infrared aerial photography (9 inch format) for 
both selected flightlines for each LANDSAT test site. Attempts were made 
to coordinate overflight dates for the aerial photography with the LANDSAT 
imagery. NASA provided aerial photography on two separate dates for the 
Kansas, South Dakota, and Missouri test sites, and three dates for the 
Idaho test site. The South Dakota Remote Sensing Institute also provided 
photographic coverage (70mm color positive, infrared) for the selected 
flightlines in the Kansas, South Dakota, and Missouri test sites for one 
overflight date. Photographic check-in procedures were as follows: 


1. Locate, delineate, and identify all JES segments and training 
segments on the aerial photography from County Highway maps. 

2. Record the frame number or numbers each segment is located on.
Tables 19-22 summarize the photographic coverage for each segment.


2.3.3 Scanning Procedures 


The JES segments were scanned on a microdensitometer with an effective
aperture size of 240 microns square. Reduction of the volume of data
was one of the primary considerations which led to the choice of such
an aperture. Using this aperture, one data point covers a land area
approximately 95 feet square on the NASA photography. Each segment was
scanned with a clear, red, green, and blue color filter and in two scan-
ning modes. Thus, multivariate observations are obtained for each data
point. Prior to actual scanning of the photography, it was necessary to
record coordinates of corner points of fields and field boundaries to
identify training data for the classifier. A sketch of each segment was
made from large aerial maps (scale: 8" = 1 mile) showing each field (small
land area devoted to one crop species or agricultural practice). Field
boundary coordinate information was recorded on these sketches. Figure
7 is a simplified sketch of a JES segment with field boundary coordinates
recorded. Appendix D contains detailed instructions for the scanning
procedures.
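The link between aperture size and ground coverage can be made concrete with a small calculation. The sketch below is illustrative only: the function names are hypothetical, and the photo scale it returns (roughly 1:120,000) is derived from the 240 micron and 95 foot figures quoted above, not stated in the report.

```python
FEET_TO_METRES = 0.3048  # exact by definition


def implied_scale_denominator(aperture_microns, ground_feet):
    """Photo scale denominator implied by an aperture and its ground footprint."""
    return (ground_feet * FEET_TO_METRES) / (aperture_microns * 1e-6)


def ground_footprint_feet(aperture_microns, scale_denominator):
    """Side length on the ground (feet) covered by one square aperture."""
    return aperture_microns * 1e-6 * scale_denominator / FEET_TO_METRES


# 240 micron aperture covering about 95 feet on the ground
scale = implied_scale_denominator(240, 95)
```

At that scale, one data point covers about 9,000 square feet, or roughly a fifth of an acre, so a 20 acre field yields on the order of 100 data points.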

Data Conversion and Preparation for Classification 

Output data from the microdensitometer is stored on magnetic tapes. Each 
file on the magnetic tape corresponds to one segment scanned with one 
color filter and recorded in one scanning mode. In order to obtain mul-
tivariate observations for each data point, a software program, PDSCMS
(Appendix E), was developed to merge the data from several microdensito-
meter output files, each corresponding to one scanning mode and filter
combination, into one file which was compatible with the Statistical
Analysis System (SAS).
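The merge step can be pictured as joining the per-file readings on pixel coordinates. The following Python sketch is not PDSCMS (which is documented in Appendix E); the mode names, the dictionary layout, and the rule of keeping only points present in every file are all assumptions made for illustration.

```python
from itertools import product

MODES = ("mode_1", "mode_2")                # hypothetical names for the two scanning modes
FILTERS = ("clear", "red", "green", "blue")  # the four color filters


def merge_scans(scans):
    """Merge per-file readings into one multivariate observation per data point.

    scans: dict mapping (mode, filter) -> dict of (x, y) -> reading
    Returns a dict (x, y) -> tuple of 8 readings in a fixed (mode, filter) order.
    Only points present in all eight files are kept (an assumption).
    """
    keys = list(product(MODES, FILTERS))
    common = set.intersection(*(set(scans[k]) for k in keys))
    return {pt: tuple(scans[k][pt] for k in keys) for pt in sorted(common)}
```

Fixing the (mode, filter) order once means every observation vector has the same variable layout, which is what a discriminant analysis routine needs.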

In order to perform crop classification using discriminant analysis, it 
is necessary to "train the classifier." To facilitate automated assign- 
ment of training data for each crop class, a software program was developed. 
The program generated SAS program statements to assign tract and field 
identifiers to data points on the basis of the coordinates of each pixel. 

The program assigns these labels only to data points contained within user 
defined rectangles whose sides are parallel to the scanning axes. The 
tract and field identifiers were then used to merge the microdensitometer 
data with the ground information collected during the 1972 growing season. 
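The rectangle-based labeling can be sketched as a simple containment test. This is a minimal Python illustration rather than the SAS statements the original program generated; the `Rect` type, field names, and the first-match rule for overlapping rectangles are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """User-defined rectangle with sides parallel to the scanning axes."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    tract: int
    field: int


def label_points(points, rects):
    """Assign (tract, field) to each data point that falls inside a rectangle.

    Points outside every rectangle are left unlabeled, so they are
    excluded from the training data. The first matching rectangle wins.
    """
    labeled = {}
    for x, y in points:
        for r in rects:
            if r.x_min <= x <= r.x_max and r.y_min <= y <= r.y_max:
                labeled[(x, y)] = (r.tract, r.field)
                break
    return labeled
```

Restricting labels to rectangles drawn inside each field keeps boundary pixels, which mix two crops, out of the training set.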

The ground information that was collected monthly included the crop spe-
cies and crop condition for each tract and field within the segment. The
crop condition and crop species were used to form the groups for classifica-
tion with discriminant analysis. Thus, an observation vector in the merged
data set contains the following information:

1. The value of relative light intensity for each of two
scanning modes and four filter combinations,


Figure 7 — Sketch of Segment Showing Field Boundaries and Crop Classes.

(Sketch not reproduced. Field corners are marked with coordinates relative
to an origin at (0,0); the fields are labeled Corn, Fallow, Pasture,
Alfalfa, Plowed, Winter Wheat Harvested, and Sugar Beets.)


TABLE 19 


MISSOURI AERIAL PHOTOGRAPHY 


Mission; Date* 

^08;' 8/78772 T 

HIT 

' $719772 ?s 

.D.R.S.I.; 9/19-20/72 
4 filters 
Frame No . 

Camera, Roll : 
Segments t 

RC-8; 33 : 
Frame No. : 

ZEISS: 34: 
Frame No. : 

RC-8; 

Frame 

42 :ZEISS; 44: 
No. :Frame No. : 

F.L. 2 







4418 

29 

— 

99 

— 

38 & 39 

4420 

31 

55 

98 

25 

42 


F.L. 8 







4411 

05 

7 

127 


3 


3412 

07 

— 

124 


28 & 29 

1413 

07 

12 

124 

78 

19 - 25 

4414 

04 

6 

128 

84 

6 & 7 


1435 

13 

22 

120 

69 

9 


3436 

10 

17 

122 

73 

2 


4458 

11 

— 

121 




4460 

08 

16 

123 

76 

32 & 

33 

Extra 







3416 

28 

— 

— 




4417 

30 

53 

98 




4419 

29 

— 

99 




3432 

15 

— 

118 




4434 

12 

— 

120 




4437 

10 

— 

123 




Training 







2A1 

31 

55 

97 

23 

44 & 

45 

2A2 

30 

55 

98 

24 

47 - 

53 

2B 

29 

— 

100 


29 & 

30 

2C 

29 

— 

99 


33 


2D 

28 

49 

100 


25 & 

26 

8A 

05 

6 

128 

85 

11 & 

12 

8B 

07 

12 

125 

79 

14 & 

15 

8C 

11 

18 

121 

72 

37 & 

38 

8D 

11 

19 

121 

72 

5 & 

6 

8E 

12 

— 

120 


41 & 

43 20 & 21 

8F 

15 

26 

118 

64 

15 & 

16 






TABLE 20 


KANSAS AERIAL PHOTOGRAPHY 

208 : 8718/72 : 211; 9/17/72 :S.D.R.S.I.; 8/12-14/72 

Camera; Roll :RC-8; 1 :ZEISS; 3 :RC-8; 33 :ZEISS; 35:4* Filters of 4 rolls Each 
Segments :Frame No. : Frame No. : Frame No.:Frame No.: Frame No. 


F.L. 3 


4087 

41 

— — 

1089 

43 

85 

4101 

48 

95 

3106 

37 

72 

4107 Noc 

34 

66 

1113 

53 

107 

4114 

50 

100 

1115 

40 

79 

3116 

41 

81 

F.L. 10 

4120 

14 

26 

3122 

24 

48 

4124 

18 

35 

1125 Noc 

Noc 

— — 

4130 

22 

43 

Extra 

4088 

44 


Training 

3-A 

50 

101 

3-B 

36 

70 

3-C 

37 

72 

3-D 

40 

81 

3-E 

40 

81 

3-F 

42 

83 

3-G 

42 

83 

3-H 

43 


3-1 

43 

85 

3-J 

43 

87 

3-L 

46 

— 

3-M 

47 

— 

3-P 

54 

109 

10-A 

24 

— 

10-E 

9 

17 


19 

— 

B26 - 31 

17 

29 & 271 

— 

13 

20 & 280 

A27 - 30 

23 

259 

B 1 - 5 

27 

— 

C40 

07 

08 & 291 

A53 - 56 

10 

16 & 285 

A34 - 38 

21 

265 

B12 - 15 

19 

268 

B22 



MM 

D12 - 16 

MM 

MM 

C23 - 26 

MM 


C 1 - 8 

MM 

U«M 

- 



MM 

C17 - 19 


17 




10 

14 & 286 

A42 


25 

45 

C36 


24 

260 

- 


20 

266 

B 7 & 

8 

20 

267 

B18 


19 

32 & 269 

B39 - 

51 

19 

32 & 269 

B57 - 

64 

17 

— 

B36 


17 

30 & 272 

A 3 


17 

28 & 273 

A 8 


14 

— 

A15 


13 

— 

A18 


06 

07 & 293 

A49 & 

50 

MM 

— 

C32 


MM' 

__ 

D 2 - 

5 


Note: 


RC-8 and ZEISS coverage of segments 1113, 4114, and 3A are also available 
from Mission 217 dated 10/24/72. 


TABLE 21 


SOUTH DAKOTA AERIAL PHOTOGRAPHY 


Mission; bate:_ 

211 ; vrmn 

1 2TTT" 

9/14 Hi rS.D.R.S.I.: 8/27/72 

Camera, Roll : 
Segments : 

RC-8; 54 
Frame No. 

: ZEISS; 56 
: Frame No. 

:RC-8; 17 
: Frame No 

: ZEISS; 19:4 filters and 4 foils 
. : Frame No.: Frame No. 

F.L. 3 




None 

3196 

2934 

70 


46 

4197 

2932 

66 


54 

1199 

2934 

71 


50 

4210 

2930 

62 


5 & 6 

F.L. 5 




None 

1213 

2908 

18 

188 

26 

1223 

2912 

27 

184 

14 

3236 

2906 

14 

191 

35 

4237 

2906 

— 

191 

32 

4240 

2915 

— 

181 

8 

Extra 




None 

1195 

2934 

— 



4198 

2933 

69 



4208 

2928 

— 



4211 

2928 

— 



3212 

2909 

— 

187 


4214 

2908 

20 

188 


3222 

2913 

— 

— 

22 & 23 

4224 

2912 

27 

184 


1235 

2906 

— 

190 


1239 

2918 

— • 

179 


4241 

2918 

39 

179 


Training 




None 

3-A3 

2930 

62 


1 

3-B-9 

2933 

68 


53 

3-C-3 

2935 

— 


44 

3-C-5 

2935 

72 


48 

3-v-$ 

2935 

— 


41 

3-1-8 

2935 

74 


38 

5— C— 2 

2913 

27 

184 

12 

5-C-3 

2913 

29 

184 

20 

5-C-4 

2913 

29 

— 

16 

5-E-2 

2908 

17 

189 

29 




TABLE 22 

IDAHO AERIAL PHOTOGRAPHY 


Mission; Date 
Camera 

Segments 

: 72-138; 8/11/72 : 

9/7/72 

10/25/72 

„ RC-8„ 

: Frame No. : 

Frame C ”8o. 

_ RC-8 „ 
Frame No . 

F.L. 5 




8101 

4702, 4812-13 

3885-86, 3900-01 

5565-66, 5652-53, 5820 

8103 

- 

3881 

5647 

8111 

4699-4700, 4814-15 

3884-85, 3902-03 

5650-51 

3423 

4816 

3904-05 

— 

1554 

4699,4814-15 

3883-84 

5650-51 

1559 

4699,4815-16 

3883-84, 3903-04 

5650, 5667 . 

F.L. 6 




8094 

4812 

3900-01 

5664, 5822 

8098 

4811-12 

3899-3900 

5817, 5663-64 

8109 

4813 

3886, 3901-02 

5665 

9110 

4700,4814 

3884-85, 3901-02 

5661, 5665-66 

8113 

4814-15 

3902-03 

5666-67 

8265 

4816 

-3904-05 

5668 

2332 

4811-12 

3899-3900 

5663, 5817 

8339 

4816 

3904-05 

5668 

3422 

4812-13 

3900-01 

5664, 5821-22 

Extra 




8096 

4703 



8099 

4701 

1 


8102 

4701 



8112 

4814 

3902-03 

5666-67 

8115 

4701 



1549 

4702 



1550 

4702 



Training 




5-A-2 

4702-03, 4810-11 

3887-88, 3899 

5654, 5817-18-19 

5-B-2 

4702,4812-13 

3886-87, 3900-01 

5653, 5820 

5-C-2 

4814-15 

3884-85, 3902-03 

5665-66 

5-D-2 

4699-4700, 4814-15 

3884-85 

5650-51 

5-K-5 

4815 

3903-04 

5667 

5-K-6 

4815 

3903-04 

5667 

6-C-2 

4812-13 

3900-01 

5664-65, 5821-22 

6-D-l 

4812-13 

3900-01-02 

5664-65, 5812-22 

6-F-3 

4813-14 

3901-02 

5665-66, 5821 

6-F-4 

4813-14 

3901-02 

5665-66, 5821 

6-H-l 

4814 

3901-02 

5665-66 

6-H-2 

4814 

3901-02 

5665-66 

6-1-1 

4814 

3902-03 

5666-67 

6-1-2 

4814 

3902-03 

5666-67 

6-J-4 

4814 -15 

3902-03-04 

5666-67 

6-L-4 

- 

3902 

5665-66, 5822-23, 9095 




2. The x,y - coordinates, 

3. The tract, field number, 

4. Crop and crop condition on four month visits. 

There are eight spectral variables, two spatial variables, and four label 
variables making up each pixel. 

III. Software and Data Processing 

3.1 Segment and Field Location 

3.1.1 Objectives 

A primary objective of this phase of the project was to develop proce-
dures which would enable the user to locate small areas in LANDSAT images
that are identified on maps. These areas must be identified with great
accuracy if they are to be used either as training sites for discriminant
analysis or as test sites for the estimation procedures.

3.1.2 Approach 

The method used to find segments and field boundaries was mostly a manual 
operation. The procedure is outlined below. 

1. The exact location of the individual JES segments was drawn 
on county highway maps. 

2. The approximate locations of the JES segments on 1/250,000 
scale color enlargements of the LANDSAT frame were determined 
by a visual comparison of the enlargement with the county 
highway maps. 

3. Grey scale maps of large areas around the location of each 
segment were generated from computer compatible MSS tapes. 
Generally, these maps were from response band 5. 

4. Visual correlation of features distinguishable on the county
highway maps, on the color enlargements of the LANDSAT imagery
and on the grey scale computer printouts was used to find
the location of individual segments in the LANDSAT frame.

Field boundaries had been drawn on T'/SbO* scale aerial photographs of 
the JES segments. These photographs were then used as a basis for sketch- 
ing the field boundaries on the computer grey scale printouts. Next, an 
area definition card was punched for every scan line that crossed each 
field. A more detailed description of this procedure is included in
Appendix C.

Two different computer programs were used to produce grey level maps. The 
first was called NMAP and is from the Penn State Classification System. 

This program has several good points.

1. It can map any combination of channels to a maximum of 16
channels.

2. It can produce grey-level maps with a variable proportion of
points in an interval.

3. It can use either LANDSAT or LARSYS III format tapes as input.

Some of NMAP's disadvantages are:

1. It requires a format conversion run.

2. It must do a map to obtain the initial grey level response
histogram.

To speed up the mapping process, a second mapping program, RADMAP, was
developed. It has the following advantages over NMAP.

1. It maps at a faster rate.

2. It can sample to determine the response histogram and set
the grey levels accordingly.

The major disadvantages are:

1. It will only map one band at a time.

2. It is limited to LANDSAT computer compatible tapes.
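
The idea of sampling a response histogram and setting grey levels from it can be sketched in a few lines. This is only an illustration under stated assumptions, not the NMAP or RADMAP code: the function names, the sampling stride, the print symbols, and the response values are all hypothetical.

```python
def grey_level_cutoffs(values, n_levels, stride=1):
    """Cutoffs that put roughly equal shares of the sampled pixels in each grey level."""
    sample = sorted(values[::stride])  # stride > 1 samples instead of reading every pixel
    return [sample[(k * len(sample)) // n_levels] for k in range(1, n_levels)]


def to_symbol(value, cutoffs, symbols):
    """Return the print symbol for one response value."""
    level = sum(1 for c in cutoffs if value >= c)  # number of cutoffs reached
    return symbols[level]


band5 = [12, 15, 14, 30, 31, 29, 45, 47, 44, 60, 62, 61]  # made-up band 5 responses
symbols = [" ", ".", "+", "#"]                            # hypothetical symbols, low to high
cutoffs = grey_level_cutoffs(band5, n_levels=len(symbols))
line = "".join(to_symbol(v, cutoffs, symbols) for v in band5)
print(cutoffs)
print(repr(line))
```

Choosing the cutoffs from the data, rather than fixing them in advance, keeps each symbol in use even when a scene occupies only a narrow part of the response range.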

3.1.3 Evaluation 

The segment location procedure described here was reasonably effective
in southwestern Kansas and in the Snake River Valley of Idaho. These
areas were characterized by a regular 'checkerboard' road pattern,
moderately large regular fields, and by a number of crops which had
distinctly different reflectance patterns. We had more difficulty in east
central South Dakota and in southeastern Missouri. The principal problem
in South Dakota was that, at the time the LANDSAT imagery was taken,
crops seemed to look much the same. Also, there were not as many of the
distinctive field patterns as were found in Kansas. Missouri was
characterized by irregular road and field patterns and by heavy woodlands which
helped to hide the roads.

A more fully automated procedure is needed for any further work in this
area. Among the possibilities for inclusion in such a procedure could
be the following.

1. A program which could compute the approximate location of
test sites in a given LANDSAT frame.

2. The use of affine transformation to locate points in small
areas of the LANDSAT frame (CITARS; F.G. Hall, M.E. Bauer,
W.A. Malila).

3. A grid digitizer to convert map boundaries to a series of
data points which could be converted to LANDSAT frame
coordinates.
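
As an illustration of the second possibility, the sketch below fits an affine transformation from map coordinates to LANDSAT frame coordinates by least squares on a few control points. It is a minimal sketch, not a procedure from this project: the function names and the control point values are hypothetical.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def fit_affine(map_xy, frame_lc):
    """Least-squares fit of line = a0 + a1*x + a2*y and column = b0 + b1*x + b2*y."""
    rows = [(1.0, x, y) for x, y in map_xy]
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for k in range(2):  # k = 0 fits scan lines, k = 1 fits columns
        rhs = [sum(r[i] * t[k] for r, t in zip(rows, frame_lc)) for i in range(3)]
        coeffs.append(solve3(normal, rhs))
    return coeffs


def apply_affine(coeffs, x, y):
    """Map one (x, y) map point to (line, column) in the frame."""
    return tuple(c[0] + c[1] * x + c[2] * y for c in coeffs)


# Hypothetical control points: map (x, y) and the matching frame (line, column).
map_xy = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
frame_lc = [(100.0, 50.0), (120.0, 55.0), (95.0, 80.0), (115.0, 85.0)]
coeffs = fit_affine(map_xy, frame_lc)
line, col = apply_affine(coeffs, 5.0, 5.0)
print(round(line, 3), round(col, 3))
```

An affine fit absorbs shift, rotation, scale, and skew, which is why it is usually adequate only over a small area of the frame; with more than three control points the least-squares residuals give a rough check on the control point quality.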

3.2 Software Implementation for Crop Classification

3.2.1 Objectives

The main objective was to find and install in the USDA Washington Computer 
Center a series of computer programs to perform discriminant analysis 
(pattern recognition). In addition, the following related objectives 
should be satisfied. 

A) The software should be relatively easy to install and maintain. 

B) The system should use a uniform control card setup for both the 
system and in-house developed programs. 

C) The program package should be highly modular to permit experi- 
mentation. 

D) The program should provide support software for data handling. 

E) Programs should be easy to use and not require a lot of cumbersome
vendor JCL statements.

F) The software system must be reasonably efficient. This may be 
in terms of fast computational algorithms and/or data reduction 
schemes to reduce volume. 

3.2.2 Approach 


There were three systems available to us that could perform the required
discriminant analysis.




The first package considered was SAS (Statistical Analysis System). 1/
This system is written to run only on an IBM 360/370 computer, and is
distributed in both load module and source form. Installation is as simple
as creating a program library or adding members to an existing program
library. Maintenance is minimal because the authors provide all necessary
program support and send updated library tapes.

The system allows the user to create his own procedures by modifying
existing procedures or writing them from scratch. The SAS supervisor
provides software support such that all usual control card and data
management features are available to the user. A user procedure is treated
exactly like a normal SAS procedure.

In general, SAS is easy to use, and the SAS language permits almost unli- 
mited manipulation of data. However, the conversion of LANDSAT data tapes 
into SAS observations requires considerable programming because the SAS 
language has no simple provision to break up a line of data into a series 
of SAS observations. 

The original procedure, DISCRIM, prints a line for every data point
classified. Clearly, this is too much output for a LANDSAT file. In addition,
the procedure reads the entire data set twice, once to find the calibration
data, and once to classify. The procedure does not save the calibration
data, nor create a SAS file of classification results.

Procedure DISCRIM was modified to create an in-house procedure that did
not print the results for each point classified, but rather created a
SAS compatible file that could be read in using the input processor.

Drs. Barr and Goodnight extended the features of the discriminant
procedure in the following ways:

1. Limit the printing of point by point classification results to 
desired levels and always print a summary. 

2. Accept calibration and unknown data from separate files.

3. Save and reuse the calibration results. 

4. Output the classified data as a SAS file for later analysis. 


1/

Developed at North Carolina State University by A. J. Barr and
J. H. Goodnight.




The second software package was the Penn State Classification System. 1/
This system was written in FORTRAN, and should have been easy to install.
Some special input/output software had to be provided by the Penn State
Computer Center. This special software was obtained from Penn State.
One routine worked and one did not, but a substitute was found. The Penn
State System does work now at the WCC. The point is that the Penn State
System may not be completely transportable to other computing centers.

The core programs use a common set of control cards which facilitates 
learning to use the programs. There are some related programs that were 
developed by other users that do not strictly adhere to the control card 
setup used by the main line programs. The maximum likelihood classifica- 
tion software is an example. 

In spite of the fact that the program is broken down into subroutines,
it cannot be considered modular. There are many different subroutines
called GETLIN that are used to retrieve lines of data from the file.
Other critical subroutines share the same problem.

In addition, these subroutines do not provide for complete file control.
Therefore, any user defined program must partially process the input
file in conjunction with some version of GETLIN. This non-modularity
makes it difficult to modify or change the program.

The package does not utilize a system monitor program to manipulate data 
files. One must use the standard vendor JCL to create and pass files 
between programs and runs. 

This system does use a data reduction scheme to speed up processing. 
Normally, an investigator is interested in only a portion of a LANDSAT 
image. The programs permit the user to subset the image and retain only 
the areas of interest. A table of contents record, preceding the file,
permits the user access to any particular area as though he had the entire
image. Unnecessary data is not processed, so processing is more efficient.

The Penn State Classification System is really a collection of main level 
programs that can process a common file. The major programs are SUBERTS, 
SUBAIR, TPINFO, MERGE, NMAP, UMAP, STATS, ACLASS, ACLUS, DCLASS, and 
DCLUS. 


1/

Developed at the Pennsylvania State University, Department of Forestry, 
by Dr. F. Y. Borden and Associates. 




SUBERTS and SUBAIR are used to reformat and subdivide LANDSAT and
aircraft tapes into the Penn State format.

MERGE is used to combine data from different passes into temporal 
overlays. 

TPINFO prints the heading and table of contents records from a 
standard file. 

NMAP assigns mapping symbols to all points of specified grey levels. 
It is used to prepare line printer maps. 

UMAP assigns mapping symbols based on contrast differences. It is 
also used to outline boundaries. 

STATS computes calibration statistics to be used by the classifica- 
tion programs. 

ACLASS performs a discriminant analysis of spectral signatures that
have been normalized by reducing all data to a unit sphere.
It is used to compensate for sun angle, and was developed
for airborne scanners.

ACLUS is an unsupervised cluster analysis program which uses the
angular classification algorithm.

DCLASS performs a Euclidean distance discriminant analysis of
multispectral data.

DCLUS is an unsupervised cluster analysis which uses the Euclidean
distance algorithm.
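
The difference between the angular (unit sphere) and Euclidean distance approaches can be illustrated with a small sketch. It is not taken from the Penn State programs: the function names, class names, and signature values are hypothetical, and each class is represented only by its mean vector.

```python
import math


def unit(v):
    """Scale a signature to unit length, which removes overall brightness."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def euclidean_class(pixel, means):
    """Assign the pixel to the class with the nearest mean (Euclidean distance)."""
    return min(means, key=lambda name: math.dist(pixel, means[name]))


def angular_class(pixel, means):
    """Same rule after projecting the pixel and the class means onto the unit sphere."""
    p = unit(pixel)
    return min(means, key=lambda name: math.dist(p, unit(means[name])))


# Hypothetical four-band class means.
means = {
    "wheat": [20.0, 15.0, 40.0, 45.0],
    "dark_soil": [12.0, 9.0, 16.0, 18.0],
}
shaded_wheat = [10.0, 7.5, 20.0, 22.5]  # the wheat signature at half brightness

print(euclidean_class(shaded_wheat, means))
print(angular_class(shaded_wheat, means))
```

In this made-up case the dimmed wheat pixel lies closer to the darker class in raw Euclidean distance, while the normalized comparison still matches it to wheat, which is the kind of illumination effect the normalization is meant to compensate for.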

In addition to the above core programs, a maximum likelihood quadratic 
classification package was supplied as a related program. This program 
is not control card compatible with the core programs, but it uses the 
standard file.

The third software package considered was LARSYS III. 1/ The initial
consideration was to install it in-house. The support group at LARS and
our scientific monitors convinced us that this was beyond our means.


1/

Developed at Purdue University, Laboratory for Applications of Remote
Sensing.




LARSYS is written for a different operating system than what we have at 
the USDA Washington Computer Center. Conversions would be expected to 
take several man years, and would require some systems level programmers. 

The staff at LARS has been very generous in providing both computer time 
and computer system personnel at various times for a period of two years. 

IV. Data Analysis - Objectives and Concepts 

In this section, the objectives and concepts relating to the LANDSAT
investigation, both LANDSAT imagery and aircraft photography, are
formulated. The results are presented by state, LANDSAT imagery first, then
aircraft photography. At the conclusion, ways to use the classification
results to make acreage estimates and a method to combine data from
aircraft and satellite are presented.

4.1 Crop Classification 

4.1.1 Objectives 

1. Investigate the use of parametric discriminant functions.

2. Estimate the rate of misclassification for each type of crop.

3. Investigate the value of temporal overlays in reducing errors of
misclassification.

4. Determine differences in classification rates between states.

5. Determine differences in classification rates between months within
states.

6. Evaluate the use of training data parameters from (a) one LANDSAT
frame to another, and (b) in aerial photography from one flightline
to another.

7. Estimate the difference in classification results between dependent 
and independent data used in testing. 

4.1.2 Concepts 
Discriminant Analysis 

This background is intended to be general and enable the reader to under- 
stand the detailed computations and results that follow. Kendall and 
Stuart formulate Discriminant Analysis and Classification by stating... 




"We shall be concerned with problems of differentiating between two or 
more populations on the basis of multivariate measurements... We are 
given the existence of two or more populations and a sample of individuals 
from each. The problem is to set up a rule, based on measurements from 
these individuals, which will enable us to allot some new individual to 
the correct population when we do not know from which it emanates." 1/ 

For example, the land population of interest was the Southwest Crop
Reporting District (CRD) in Kansas. Wheat, sorghums, corn, oats, rye,
and pasture are the major populations of interest. From every acre in
the CRD, we have light intensity readings for green light, red light,
and two infrared wavelengths. These light intensities are multivariate
measurements that will be used to allot or classify each data point into
a crop type such as corn, wheat, or sorghums. A graphical representation
of the above formulation would be as follows:

Figure 8 — Conceptualized mapping from agricultural fields into measure- 
ment space. 



A sample of fields from each crop type is selected and their respective 
light intensities obtained. These sample points are plotted on a two- 
dimensional graph showing relative positions of each crop type in the 
Measurement Space (MS). The problem is to partition the measurement 
space in some optimal fashion so that points are allotted as nearly cor- 
rect as possible. Figure 9 shows the measurement space as it might be 
partitioned. 

Figure 9 — Partitioned measurement space. 



1/

M.G. Kendall and A. Stuart, The Advanced Theory of Statistics, 2nd Ed.,
Vol. 3, page 314.






Any point, no matter where it is in MS will be classified as one of the 
three crops. An unknown point where the number 1 is located in Figure 9 
will be classified as wheat because wheat is probably the group to which 
it belongs. Likewise, a point in position 2 would be classified as sor- 
ghum and a point in position 3 would be classified as corn. A point in 
position 4 would also be classified as wheat, but the probability that 
it is actually wheat is not as great as that of a point in position 1. 

There are many ways to partition a measurement space. We have done a
simple non-statistical partition above by simply drawing lines. Visually
partitioning the measurement space may work when it is one or two
dimensional, but for measurement spaces of more than two dimensions, a visual
partition is not possible. For most LANDSAT and aerial photography
classification studies a four dimensional measurement space has been used.

The method used in this report was that of constructing contour "surfaces"
in the MS. These dividing surfaces were constructed so that points falling
on the dividing surface have equal probabilities of being in either
group on each side. Those points not on the dividing surface always have
a greater probability of being classified into the crop for which the
point is interior to the contour surface. If prior knowledge of the
population density function indicates that the density is multivariate normal,
then a multivariate normal density distribution will be estimated for each
crop. It is hoped that the data are approximately multivariate normal,
since then only the mean vector and covariance matrix are required to
estimate a discriminant function. Usually small departures from normality
will not invalidate the procedure, but certain types of departures (for
example, bimodal data) may be very detrimental to the statistical technique.
However, the error rate and estimator properties are dependent on the
assumptions of the distributions and prior information.

For example, in this study a multivariate normal density was assumed so it 
becomes quite simple to estimate the density functions and the discriminant 
scores which in turn determine boundaries. 


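The discriminant score for the ith population is:

$$S_i = P_i \,(2\pi)^{-q/2}\, |\Sigma_i|^{-1/2} \exp\left[-\tfrac{1}{2}\,(X-\mu_i)'\,\Sigma_i^{-1}\,(X-\mu_i)\right]$$

where

$P_i$ is the prior probability for the ith crop
$\Sigma_i$ is the covariance matrix (q x q) for the ith crop
$\mu_i$ is the mean vector (q length) for the ith crop
$X$ is a set of measurements of an individual from the jth population,
$j \neq i$ or $j = i$

or its equivalent discriminant score, the $\log_e$ of $S_i$ (the constant
term $-\tfrac{q}{2}\log_e 2\pi$, which is the same for every crop, is omitted):

$$\log_e(P_i) - \tfrac{1}{2}\log_e|\Sigma_i| - \tfrac{1}{2}\,(X-\mu_i)'\,\Sigma_i^{-1}\,(X-\mu_i)$$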

The boundary between two populations is quadratic (curved) and the points
X that fall on the boundary have an equal probability of being in either
population.
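
A small numerical sketch may help make the log form of the discriminant score concrete. It evaluates log_e(P_i) - 1/2 log_e|Sigma_i| - 1/2 (X - mu_i)' Sigma_i^{-1} (X - mu_i) for a two-band measurement space (q = 2) and assigns the point to the crop with the largest score. The priors, mean vectors, and covariance matrices are invented for illustration and are not estimates from this study; the function names are hypothetical.

```python
import math


def quad_score(x, prior, mean, cov):
    """Log discriminant score for one crop in a two-band measurement space."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # (x - mu)' Sigma^{-1} (x - mu), using the closed-form inverse of a 2x2 matrix
    maha = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.log(prior) - 0.5 * math.log(det) - 0.5 * maha


# Invented parameters: (prior, mean vector, covariance matrix) for each crop.
crops = {
    "wheat": (0.5, (30.0, 50.0), ((16.0, 4.0), (4.0, 9.0))),
    "sorghum": (0.3, (40.0, 35.0), ((25.0, -5.0), (-5.0, 16.0))),
    "corn": (0.2, (25.0, 30.0), ((9.0, 0.0), (0.0, 9.0))),
}


def classify(x):
    """Return the crop with the largest score, plus all scores."""
    scores = {name: quad_score(x, *params) for name, params in crops.items()}
    return max(scores, key=scores.get), scores


label, scores = classify((32.0, 47.0))
print(label, {k: round(v, 2) for k, v in scores.items()})
```

Because each crop carries its own covariance matrix, the set of points where two scores are equal is a quadratic curve rather than a straight line. Raising a crop's prior raises its score everywhere and so enlarges its region of the measurement space.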

When an unknown land point is classified, its measurement vector is
compared to the mean vector for each crop represented. The point is assigned
to the crop whose mean point is "nearest" from a statistical point of view.

The procedure used for finding the "nearest" mean uses the Mahalanobis
measure of distance, not the Euclidean. This is illustrated in Figure 10.

Figure 10 — Measurement Space showing two crop density functions and an 
unknown point (x).



The point x is actually closest (Euclidean distance) to the mean vector
(center point) of B. However, when one takes into account the variance
and covariances, x is found to be closest to Group A based on a
probability concept and an outlier of Group B. Therefore, the point would be
classified into Group A, because the probability that the point (x) is a
member of Group A is much greater than for Group B.
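
The situation described for the point x can be reproduced with a short sketch. The two groups below are hypothetical, chosen so that Group A is widely spread along a diagonal while Group B is tight; none of the numbers come from the study data.

```python
import math


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance (x - mean)' cov^{-1} (x - mean) for two bands."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


# Hypothetical groups: (mean vector, covariance matrix).
groups = {
    "A": ((0.0, 0.0), ((16.0, 12.0), (12.0, 16.0))),  # wide, strongly correlated
    "B": ((7.0, 3.0), ((1.0, 0.0), (0.0, 1.0))),      # tight, uncorrelated
}
x = (5.0, 5.0)

nearest_euclid = min(groups, key=lambda g: math.dist(x, groups[g][0]))
nearest_maha = min(groups, key=lambda g: mahalanobis_sq(x, *groups[g]))
print(nearest_euclid, nearest_maha)
```

The point is nearer to the mean of B in ordinary distance, but it lies along the long axis of A and several standard deviations outside B, so the Mahalanobis measure assigns it to A.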

So the partitioning of the MS is done by computing the means for each 
crop type and using the Mahalanobis distances from this mean. This 
distance depends on the covariance matrix and is a measure of probabi- 
lity. The discriminant functions without prior probabilities are: 

1) $(X - \bar{X}_i)'\, S^{-1}\, (X - \bar{X}_i)$, which is a simple estimate of
$(X - \mu_i)'\, \Sigma^{-1}\, (X - \mu_i)$, if linear discriminant functions are
used, and

2) $-\tfrac{1}{2}\log_e|\Sigma_i| - \tfrac{1}{2}\,(X - \bar{X}_i)'\, S_i^{-1}\, (X - \bar{X}_i)$
if quadratic discriminant functions are used. These functions are the
exponents of the density formula of the multivariate normal distribution,
$C \exp\left[-\tfrac{1}{2}\,(X - \mu_i)'\,\Sigma_i^{-1}\,(X - \mu_i)\right]$,
depending on the i'th crop. If $\Sigma_i = \Sigma_j$ for all $i \neq j$,
linear discriminant functions are used.

It is worth pointing out that if linear discriminant functions are used,
one assumes (1) that $\Sigma_i = \Sigma_j$, (2) that for all crops in the MS
the major and minor axes are equal, and (3) that the sample data of each crop
has the same slope. Such an event in two-space is shown in Figure 11.

Figure 11 — Measurement Space where crop types have same covariance matrix. 



This space can be partitioned effectively with straight lines thus we can 
use linear discriminant functions. 


Figure 12 shows a MS where covariance matrices are not equal, and there- 
fore, linear discriminant functions are not appropriate. In either case, 
the Mahalanobis distance is used. 

Figure 12 — Measurement Space when crops have different covariance matrices. 



In Figure 11, even though a common center point is not present, a common 
covariance (ellipse) matrix would be computed. In Figure 12 a different 
covariance matrix will be needed for each crop type. When the off-diago- 
nal elements in the covariance matrix are unequal, the slopes of the data 
are different and linear discriminant functions are not appropriate. 




The above techniques follow from our first assumption that the data is 
normally distributed in the MS. In practice, however, one does not 
decide what the distribution of the population density is in MS and 
program the correct procedure. One uses the available procedures for 
analyzing data. Most available programs assume multivariate normal data 
because the program and the calculations are greatly simplified. Thus, 
it becomes necessary to justify the use of these simplified programs. 

In order to explain better how a parametric procedure can reduce the 
work load, consider that the first step in the discriminant analysis (DA) 
is to estimate the population density function in the MS, with a sample 
of points from each crop. Once these population density functions have
been estimated, then partitioning the space is extremely simple. 

To estimate a multivariate population density in MS for corn where we
have no prior information except sample data on corn is extremely
difficult. If a sample of 1000 points was available, each of these 1000 data
points would need to be stored in the computer. On the other hand, if we
are working with a multi-dimensional normal distribution, theory tells
us that only the sufficient statistics (mean vector and covariance
matrix) need to be computed and stored in the computer.

The individual data points could be discarded because no additional infor- 
mation about the population distribution in the MS is available in these 
points. (There would be information about how well the data fits the 
normal distribution in these 1000 data points). 
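
As a rough illustration of the data reduction involved, the sketch below computes the mean vector and covariance matrix from a set of training points. The function name and the sample values are hypothetical.

```python
def mean_and_cov(points):
    """Mean vector and unbiased covariance matrix of a list of measurement vectors."""
    n = len(points)
    q = len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(q)]
    cov = [
        [sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / (n - 1)
         for j in range(q)]
        for i in range(q)
    ]
    return mean, cov


# Made-up two-band training points for one crop.
points = [(2.0, 4.0), (4.0, 8.0), (6.0, 6.0), (8.0, 10.0)]
mean, cov = mean_and_cov(points)
print(mean)
print(cov)
```

For four MSS bands, a sample of 1000 points holds 4000 numbers, while the mean vector and the symmetric covariance matrix together hold 4 + 10 = 14 distinct values.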

Another consideration is that all the techniques we have described 
require independent random samples from each crop in order to estimate 
the population density in the MS (training data). This point is mentioned
because most remote sensing analysts do not work with randomly selected
points. In this study we have tried to work with randomly selected fields. 
However, the points within these fields are not a random sample of all 
possible points in a given crop, but the data are nested within fields. 
Consequently, the random selection is restricted to the selection of 
fields within the randomly selected segments. 

One type of prior information that can be used in the classification pro- 
cedure is the relative frequency of occurrence (prior probabilities) for 
each of the K populations in the total land population. For example, if 
1/3 of all land is wheat, and 1/3 is pasture as it might be in parts of 
Kansas, this information would be used and it would affect the partitioning
of the measurement space accordingly. If a crop has a high chance of
selection, then the area in the MS would be increased. Conversely, if a 
certain crop has a very low chance of occurrence, then the area in MS 
would be adjusted downwards. 





One last point to be covered on procedures used would be to define what
is meant by thresholding. Suppose some unknown crop for which there is
no sample in the original data set is to be classified. With the present
procedure, it must be classified as Crop A, B, or C, depending on its
probability of being in either A, B, or C. For example, in Figure 13,
if the probability P(A|X) that the point x was Crop A is .01 and P(B|X) =
.001, and P(C|X) = .02, the point x would be classified as belonging to
Crop C even though the probability is only .02. It would be an outlier
for Crop C, and therefore, we want to let it remain unclassified.
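
A thresholding rule of this kind can be sketched as follows. The probabilities are the ones quoted in the example; the cutoff of .05 is an assumption chosen here to match the 95% limits drawn in Figure 13, and the function name is hypothetical.

```python
def classify_with_threshold(probs, threshold):
    """Return the most probable crop, or None when even the best crop is too unlikely."""
    best = max(probs, key=probs.get)
    return best if probs[best] >= threshold else None


probs = {"A": 0.01, "B": 0.001, "C": 0.02}  # values from the example in the text

without_threshold = classify_with_threshold(probs, threshold=0.0)
with_threshold = classify_with_threshold(probs, threshold=0.05)  # assumed cutoff
print(without_threshold, with_threshold)
```

Without a threshold the point is forced into Crop C; with the assumed cutoff it is returned as unclassified (None).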


Figure 13-Measurement Space showing an outlier and three crop areas with 
95% confidence limits. 



4.1.3 Description of LANDSAT Data

The satellite data used in this report is LANDSAT Multi-Spectral Scanner
(MSS) data and is described in Section 3 of Data User's Handbook. 1/

The MSS is a passive electro-optical system that can record radiant
energy from the scene being sensed. All energy coming to earth from the
sun is either reflected, scattered, or absorbed and, subsequently, emitted
by objects on earth. 2/ The total radiance from an object is composed of
two components, reflected radiance and emitted radiance. In general, the
reflected radiance forms a dominant portion of the total radiance from an
object at shorter wavelengths of the electromagnetic spectrum, while the
emissive radiance becomes greater at the longer wavelengths. The
combination of these two sources of energy would represent the total spectral
response of the object. This, then, is the "spectral signature" of an
object and it is the differences between such signatures which allow
classification of objects using the statistical techniques just
discussed. The particular product is system corrected images, which refers to


1/

Published by Goddard Space Flight Center.

2/

Baker, J.R. and E.M. Mikhail, Geometric Analysis and Restitution of
Digital Multispectral Scanner Data Arrays. LARS information note 052875.




products that contain the radiometric and initial spatial corrections
introduced during the film conversion. Every picture element (pixel) is
recorded with 4 variables; each variable corresponds to one of the 4 MSS
bands. Table 23 shows the relationship between the MSS bands and light
wavelengths.


Table 23 — Sensor spectral band relationships. 


Sensor    Spectral Band    Wavelengths      Color            Band Code
          Number           (micrometers)

MSS       1                .5 - .6          Green            4
MSS       2                .6 - .7          Red              5
MSS       3                .7 - .8          Near Infrared    6
MSS       4                .8 - 1.1         Infrared         7


The numbers are similar to transmission values: zero radiance at Step
15, which is black on positives, and maximum radiance at Step 1, which is
white on positives. The radiance varies linearly with gray scale step
transmission between these values, with the difference between each step
corresponding to 1/14th of the maximum radiance. The recording format
in the CCT is 8 bits, the sensor range is 7 bits, and the actual dynamic
range of usable data is between 5 and 6 bits.
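
The linear relation between gray scale step and radiance can be written out directly. This is only an arithmetic illustration of the statement above; the function name is hypothetical.

```python
def relative_radiance(step):
    """Fraction of maximum radiance for gray scale steps 1 (white) through 15 (black)."""
    if not 1 <= step <= 15:
        raise ValueError("gray scale step must be between 1 and 15")
    return (15 - step) / 14


for step in (1, 8, 15):
    print(step, relative_radiance(step))

# Number of distinct levels for the quoted word sizes.
cct_levels = 2 ** 8      # 8-bit recording format on the CCT
sensor_levels = 2 ** 7   # 7-bit sensor range
print(cct_levels, sensor_levels)
```

Step 8, halfway along the scale, corresponds to half of the maximum radiance.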

The analysis was started by first locating the test and training data
(ground observations) with either the Penn State University program (NMAP)
or an in-house program (RADMAP) that produces gray scale maps. 1/ After
the ground enumeration information was located on LANDSAT CCT's,
rectangular areas within fields were located and punched using the LARS field
description card format. Once these cards were obtained and checked, the
statistics function in LARSYS was employed to extract univariate graphs
to detect bimodal classes.

In most cases, analysis proceeded from the statistics program to the
program for classifying points, but with the introduction of a feature to
use prior probabilities. These classifications were stored on tape by
file number so the print results function could be run more than once.

4.1.4 Results 

The results will be presented by state since there was a slightly different 
situation in each state. All LANDSAT analysis is presented first then 
the aircraft follows. 


1/

See Section 3.1, Segment and Field Location.




Missouri LANDSAT: 


The Crop Reporting District (CRD) that was the test site was in the south- 
east corner of the state. This area is outlined in black on the map of 
Missouri, Figure 3. 

Summary of Results 

The Missouri test site covers 4,660 square miles. There are 50 segments, 
each about a mile square. These segments constitute a random sample 
from all land areas. The ground enumeration was taken from these seg- 
ments. This information was used for both training and testing. 

Analysis of Missouri data was done using a tape that was assembled at
LARS. The data for three dates, August 26, September 13, and October 21,
1972, were geometrically corrected then overlaid to create a tape with
temporal data. Therefore, data from three different times in the growing
season were available for analysis and covered an area that
contained 29 of the JES segments in this CRD. The principal results are
summarized below:

1. A test was run on the covariance matrices between crops to see
if they were equal. The results of this test were that they very
likely were not equal. Thus, linear discriminant functions
seemed inappropriate.

2. Best overall correct classification rate was 70%. This included 
using temporal overlays and using unequal prior probabilities. 




3. Unequal prior probabilities for crops improved classification 
results by 10% over using the assumption of equal probabilities 
for crops. 

4. The temporal data improved the classification by 10% even though 
the dates were not optimum. 

5. One classification was run on data to estimate the effect of inde- 
pendent data. The difference was 9%, and was an over-estimate. 

Data Analysis - LANDSAT 

In the analysis, the equality of the covariance matrices was checked 
first because this is essential for the linear discriminant analysis 
assumptions to be valid. A test presented in Morrison’s Multivariate 
Statistical Methods, page 152, was used to test the within crop covariance 
of LANDSAT data. This test is not robust with respect to certain depar- 
tures from normality. 




For the following example, August 26, 1972 imagery bands 4, 5, and 7 were 
used. The covariance matrices for cotton, soybeans, and grass were tested. 
The test was conducted as follows. The null hypothesis states that the 
covariance matrices are equal. 


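$$H_0: \Sigma_1 = \Sigma_2 = \Sigma_3$$

The alternative hypothesis is:

$$H_1: \Sigma_i \neq \Sigma_j \quad \text{for some } i \neq j$$

$S_i$ is an estimate of $\Sigma_i$ based on $m_i$ degrees of freedom, where
$i$ is a crop.

$$S_{cotton} = \begin{bmatrix} 6.76 & 7.01298 & .4914 \\ 7.01298 & 11.0889 & -5.6643 \\ .4914 & -5.6643 & 39.69 \end{bmatrix}$$

$$S_{soybeans} = \begin{bmatrix} 6.6049 & 8.3623 & .8265 \\ 8.3623 & 13.9876 & -6.3146 \\ .8265 & -6.3146 & 64.6416 \end{bmatrix}$$

$$S_{grass} = \begin{bmatrix} 5.6169 & 5.8416 & .7525 \\ 5.8416 & 9.7344 & -6.3398 \\ .7525 & -6.3398 & 40.3225 \end{bmatrix}$$

Now we form the pooled estimate of $\Sigma$, where $m = \sum_{i=1}^{k} m_i$.

$$S = \frac{1}{m}\sum_{i=1}^{k} m_i S_i = \begin{bmatrix} 6.5567 & 7.4436 & .6638 \\ 7.4436 & 12.1519 & -6.0189 \\ .6638 & -6.0189 & 50.2976 \end{bmatrix}$$

The statistic for the modified likelihood-ratio test is:

$$M = m \ln|S| - \sum_{i=1}^{k} m_i \ln|S_i| = 149.25$$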




.00678 


Next, we form the scale factor:

$$C^{-1} = 1 - \frac{2p^2 + 3p - 1}{6(p+1)(k-1)}\left(\sum_{i=1}^{k}\frac{1}{m_i} - \frac{1}{m}\right)$$

and $MC^{-1}$ is distributed approximately chi-squared with degrees of freedom
$\tfrac{1}{2}(k-1)\,p\,(p+1)$ as $m_i$ tends to infinity if $H_0$ is true.

$MC^{-1} = 148.77$, d.f. = 12, $\alpha = .05$, $\chi^2(12, \alpha = .05) = 21.03$

Thus, we must reject the null hypothesis i.e. the data does not support 
the assumption that the covariance matrices are equal. 
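
The test can be retraced numerically with the sketch below. It is an illustration only. The covariance entries are the printed matrices as best they can be read, and the degrees of freedom (926, 851, and 239) are an assumption: they are the cotton, soybean, and grass sample counts of Table 24 less one, since the values of m_i are not stated here. The function names are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def box_m(covs, dofs):
    """Modified likelihood-ratio statistic M, scale factor C^-1, and degrees of freedom."""
    k, p, m = len(covs), len(covs[0]), sum(dofs)
    pooled = [[sum(mi * s[r][c] for s, mi in zip(covs, dofs)) / m
               for c in range(p)] for r in range(p)]
    big_m = m * math.log(det3(pooled)) - sum(
        mi * math.log(det3(s)) for s, mi in zip(covs, dofs))
    c_inv = 1 - (2 * p * p + 3 * p - 1) / (6 * (p + 1) * (k - 1)) * (
        sum(1 / mi for mi in dofs) - 1 / m)
    df = (k - 1) * p * (p + 1) // 2
    return big_m, c_inv, df


cotton = [[6.76, 7.01298, 0.4914], [7.01298, 11.0889, -5.6643], [0.4914, -5.6643, 39.69]]
soybeans = [[6.6049, 8.3623, 0.8265], [8.3623, 13.9876, -6.3146], [0.8265, -6.3146, 64.6416]]
grass = [[5.6169, 5.8416, 0.7525], [5.8416, 9.7344, -6.3398], [0.7525, -6.3398, 40.3225]]
dofs = [926, 851, 239]  # assumed: Table 24 sample counts minus one

big_m, c_inv, df = box_m([cotton, soybeans, grass], dofs)
critical = 21.03  # chi-squared, 12 d.f., alpha = .05
print(round(big_m, 2), round(c_inv, 5), df, big_m * c_inv > critical)
```

Under these assumptions the statistic should land in the neighborhood of the reported M = 149.25, far above the critical value, so the conclusion does not hinge on the exact degrees of freedom.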

Therefore, the necessary assumptions for valid linear discriminant analy- 
sis are not met and better results might be attained by using quadratic 
discriminant functions. Generally, we used the quadratic approach on 
our analysis. However, it should be pointed out that upon close examina- 
tion, the covariance matrices are very similar in many respects. Corre- 
sponding elements in the three covariance matrices are of at least the 
same order of magnitude and have the same sign. Under such conditions, 
it is possible to get acceptable results from a linear approach. 

Conclusions of similar tests for the September 14, 1972 data were the 
same, the covariance matrices were unequal. 

Results of the discriminant analysis (DA) are presented in a classifica- 
tion matrix (CM). Table 24 is an example of a CM using quadratic discri- 
minant functions with unequal prior probabilities. The prior probabilities 
came from the June Survey early in the season. That is, it was not assumed 
that corn, cotton, soybeans, grass, and other all have the same probability 
of occurrence. The classification parameters were obtained from the same 
data that was used in the testing phase. 


Although 12 bands were available from the three dates, only nine were
used in this study because three were of poor quality. Two consecutive
LANDSAT images contained the 29 segments. All data were used both to
partition the measurement space (MS) and to test the partition. The CM
will be biased upward because the same data were used for both purposes;
however, this bias should be small if ample data are available.




Table 24 -- Classification matrix of quadratic discriminant functions with
            unequal prior probabilities using data from three overflights 1/,
            Missouri Study Area.

         : No. of :         :        Number of samples classified into
Group    : sample : Percent :--------------------------------------------------
         : points : correct : Cotton : Corn : Soybean : Grass : Miscellaneous
---------:--------:---------:--------:------:---------:-------:---------------
Cotton   :   927  :   79.7  :   739  :   2  :   137   :   26  :      23
Corn     :    58  :   44.8  :     9  :  26  :     7   :   14  :       1
Soybean  :   852  :   71.8  :    99  :  12  :   612   :   96  :      23
Grass    :   240  :   53.3  :    42  :   1  :    66   :  128  :       4
Misc.    :   140  :   89.3  :    17  :   2  :    44   :   13  :      64
Totals   :  2217  :         :   906  :  43  :   866   :  277  :     125

Overall performance 70.8 percent

1/ August 26, 1972, MSS bands 4, 5, 7
   September 14, 1972, MSS bands 5, 7
   October 2, 1972, MSS bands 4, 5, 6, 7

The leftmost column in Table 24 identifies the crop: cotton, corn,
soybeans, grass, and miscellaneous. The next column gives the number of
sample values in each of the crop classes. For example, there are 927
cotton pixels to be classified. The next column gives the percent of
these that were classified correctly as cotton (79.7%). The rest of the
columns give the number of these pixels that were classified into each
crop class; i.e., 739 were classified correctly as cotton, while the
remainder were misclassified as follows: 2 of the 927 as corn, 137 as
soybeans, 26 as grass, and 23 as miscellaneous. The overall performance
in this table was 70.8 percent. To compute this figure, the number of
correctly classified pixels (the sum of the diagonal elements, 1569) was
divided by the total number of pixels (2217).
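The arithmetic described above can be checked with a short script. The counts are the diagonal entries and row totals printed in Table 24; the function and variable names are illustrative only.

```python
def percent_correct(correct, total):
    """Percent of pixels classified correctly."""
    return 100.0 * correct / total

# Table 24 as printed: correctly classified counts and row totals per class.
diagonal = {"cotton": 739, "corn": 26, "soybean": 612, "grass": 128, "misc": 64}
row_totals = {"cotton": 927, "corn": 58, "soybean": 852, "grass": 240, "misc": 140}

total_correct = sum(diagonal.values())   # sum of the diagonal elements
total_pixels = sum(row_totals.values())  # all pixels presented for classification
overall = percent_correct(total_correct, total_pixels)
cotton = percent_correct(diagonal["cotton"], row_totals["cotton"])

print(total_correct, total_pixels, round(overall, 1), round(cotton, 1))
```

The same two inputs also give the number of misclassified pixels as the difference between the total and the diagonal sum.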

The prior probabilities used in this study were based on a statistical 
sampling of the entire land area. Data that is collected in this way 
enables the user to estimate the prior probability and take advantage 
of this procedure. Historic data could be used, but they are more dif- 
ficult to justify when important changes between years are occurring. 

The next table is the same as the last, except that equal prior probabi- 
lities were used. 





Table 25 -- Classification matrix of quadratic discriminant functions with
            equal prior probabilities using data from three overflights 1/,
            Missouri Study Area.

         : No. of :         :        Number of samples classified into
Group    : sample : Percent :--------------------------------------------------
         : points : correct : Cotton : Corn : Soybean : Grass : Miscellaneous
---------:--------:---------:--------:------:---------:-------:---------------
Cotton   :   927  :   74.3  :   689  :  21  :    83   :   36  :      98
Corn     :    58  :   58.6  :     4  :  34  :     3   :   10  :       7
Soybean  :   852  :   39.7  :   101  :  49  :   338   :  137  :     227
Grass    :   240  :   57.1  :    34  :  22  :    22   :  138  :      25
Misc.    :   140  :   75.0  :    14  :   5  :     7   :    9  :     105
Total    :  2217  :         :   842  : 131  :   453   :  329  :     462

Overall performance 58.8 percent

1/ August 26, 1972, MSS bands 4, 5, 7
   September 14, 1972, MSS bands 5, 7
   October 2, 1972, MSS bands 4, 5, 6, 7

Most classifications done so far by other remote sensing analysts have
assumed that the crop classes are all equally likely to occur. Most
people feel this assumption is not detrimental; however, this example
illustrates that it can make a difference, especially if acreage for the
crop classes varies widely or when crops are hard to distinguish. Two
properties are worth noting: both the classification results and the
statistical properties are much better in Table 24 than in Table 25. For
example, in Table 24 the total number of pixels classified as cotton is
906, compared to the actual number of 927. In Table 25, the number of
pixels classified as cotton is 842.

A similar comparison is even more drastic with soybeans. In Table 24,
866 pixels were classified as soybeans while 852 actual points were
soybeans. In Table 25, there were 453 points classified as soybeans.
Further, the statistical properties of the estimates are better, since if
the data are normal and the prior probabilities are correct, we obtain
unbiased estimates of crop categories and we can estimate the Bayes error
rates (minimum error rates) using the classification.




A chi-square test for discriminatory power was run on the CM of Tables
24 and 25. 1/ The null hypothesis is that the classification was done
strictly at random. If the null hypothesis is correct, then the spectral
information was useless as far as giving information that would help
assign the data to a crop class. If the above hypothesis is correct,
then the statistic

$$\frac{(n - e)^2}{e} + \frac{(\bar{n} - \bar{e})^2}{\bar{e}}$$

has a chi-square distribution with 1 degree of freedom, where $n$ and
$\bar{n}$ are the numbers of correctly classified and misclassified
points, respectively, and $e$ and $\bar{e}$ are the expected numbers of
correctly classified and misclassified points under the null hypothesis.


The chi-square for Table 24 is 4626 and for Table 25 is 2782. These chi- 
square values with one degree of freedom are highly significant, and 
therefore, we conclude that the classification was not done at random. 
Another chi-square test, based on the difference between the marginal
sums and the correct number of data points in each class for Table 24,
is as follows:

$$\chi^2 = \frac{(906-927)^2}{927} + \frac{(43-58)^2}{58} + \frac{(866-852)^2}{852} + \frac{(277-240)^2}{240} + \frac{(125-140)^2}{140}$$

$$= .47 + 3.87 + .23 + 5.70 + 1.61 = 11.89$$

This chi-square statistic is similar to the one before, except that there
are 4 degrees of freedom:

$$\chi^2 = \sum_{i=1}^{k} \frac{(n_i - e_i)^2}{e_i}$$

where, for class $i$, $n_i$ is the marginal (column) total and $e_i$ is
the actual number of data points.

This chi-square value of 11.89 is significant, and therefore, the 
hypothesis that the marginal totals in Table 24 are estimating the actual 
row totals is rejected. Note that the components for grass and corn are 
the major contributors to the significant chi-square. 
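The marginal chi-square can be recomputed from the column totals and actual class sizes of Table 24. The sketch below is illustrative; the function name is an assumption, and the critical value 9.488 is the standard tabulated chi-square value for 4 degrees of freedom at the .05 level. Carrying full precision gives about 11.90 rather than 11.89, because the report sums truncated terms.

```python
def marginal_chi_square(classified_totals, actual_totals):
    """Sum of (n_i - e_i)^2 / e_i over the crop classes."""
    return sum((n - e) ** 2 / e for n, e in zip(classified_totals, actual_totals))

# Table 24: cotton, corn, soybean, grass, miscellaneous
classified = [906, 43, 866, 277, 125]   # column (marginal) totals
actual = [927, 58, 852, 240, 140]       # actual number of pixels per class

chi_sq = marginal_chi_square(classified, actual)
CRITICAL_05_4DF = 9.488  # tabulated chi-square, 4 d.f., alpha = .05

# Per-class contributions show which classes drive the statistic.
terms = [(n - e) ** 2 / e for n, e in zip(classified, actual)]
print(round(chi_sq, 2), chi_sq > CRITICAL_05_4DF, [round(t, 2) for t in terms])
```

The corn and grass terms are the two largest, in line with the note above.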

The authors know of no statistical test that compares one C.M. with
another C.M., but there are two criteria that can be used to help
evaluate a certain C.M. The first criterion simply assigns each
misclassified point a loss of 1 and each correctly classified point a
loss of 0. Under this criterion, Table 24 has a loss value of 648 and
Table 25 has a loss value of 914. This criterion is crude, but it seems
reasonable for our purposes to give a misclassified corn pixel the same
weight as a misclassified cotton pixel.


1/ S. James Press, Applied Multivariate Analysis, pages 381-383.


The second criterion is a bit more subtle. It uses the marginal totals
in the C.M. For example, in Table 24 the column sum for cotton is 906.
This means that 906 pixels were classified as cotton. Actually, there
were 927 cotton pixels. In Table 25, there were 842 pixels classified
into the cotton group. This is not close to the correct number of 927.
The marginal estimate (906) from Table 24 is within about 2 percent of
the actual. In Table 25, the marginal estimate of 842 is within 9
percent. Table 26 presents these estimates along with the percentages of
the true value.

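Table 26 -- Marginal estimate and difference from actual values.

             :        :   Unequal prior probabilities   :    Equal prior probabilities
Group        : Actual :---------------------------------:---------------------------------
             :        : Estimate : Difference : Percent : Estimate : Difference : Percent
-------------:--------:----------:------------:---------:----------:------------:---------
Cotton       :   927  :    906   :     21     :    2.2  :    842   :     85     :    9.2
Corn         :    58  :     43   :     15     :   25.9  :    131   :     73     :  125.9
Soybean      :   852  :    866   :     14     :    1.6  :    453   :    399     :   46.8
Grass        :   240  :    277   :     37     :   15.4  :    329   :     89     :   37.1
Winter wheat :    85  :     27   :     27     :   68.2  :    346   :    261     :  307.1
Odd          :    55  :     98   :     43     :   78.2  :    116   :     61     :  110.9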


In every case unequal prior probabilities were superior to the equal
prior probabilities model and, in some cases, substantially so. For
example, the number of corn pixels for Table 25 was 131, a difference of
125.9 percent from the actual 58. The number of corn pixels for Table 24
is 43, a difference of 25.9 percent from the actual 58 pixels. Soybeans,
a major item, also shows a significant improvement over the equal prior
probability model. Actually, the difference for the soybean estimate was
46.8 percent for the equal prior probability model, while for the
unequal prior probability model it was 1.6 percent.
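The percent differences quoted above follow directly from the marginal estimates. A minimal sketch, using the values printed in Table 26 for four of the classes (the function and dictionary names are illustrative):

```python
def percent_difference(estimate, actual):
    """Absolute difference between estimate and actual, as a percent of actual."""
    return 100.0 * abs(estimate - actual) / actual

actual = {"cotton": 927, "corn": 58, "soybean": 852, "grass": 240}
unequal_priors = {"cotton": 906, "corn": 43, "soybean": 866, "grass": 277}
equal_priors = {"cotton": 842, "corn": 131, "soybean": 453, "grass": 329}

for crop in actual:
    u = percent_difference(unequal_priors[crop], actual[crop])
    e = percent_difference(equal_priors[crop], actual[crop])
    print(f"{crop:8s} unequal {u:6.1f}%   equal {e:6.1f}%")
```

For every class listed, the unequal-prior estimate lies closer to the actual count than the equal-prior estimate.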

Next, the point classification systems were compared to the per-field
classification scheme. Table 27 presents the C.M. for the per-field
sample classifier system. With a point classification system, each point
in a field can be assigned to any of the crop categories. With the
sample classifier, all points in the field are assigned to the same crop
class. A problem with this procedure is that there were a large number
of fields that were not assigned to a crop because the data set was not
large enough. The algorithm requires a covariance matrix to be inverted
and therefore at least p + 1 data points are required (where p is the
number of variables). However, when enough points are present,
classification performance has generally been [remainder of sentence not
legible in the source copy].



In the work done in Missouri using the sample classifier, about 40
percent of the fields were not classified because the required number of
points for the classifier (10 in this particular case) exceeded the
number of points present within the defined fields. Of the total number
of fields, 32.9 percent were correctly identified. Considering only those
fields which were classified, 54 percent were classified correctly.
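The minimum-size rule for the sample classifier, and the field-level rates quoted above, can be sketched as follows. The function names are hypothetical; the field counts are the totals printed in Table 27 (143 fields, 58 not classified, and 47 on the diagonal).

```python
def min_points_required(n_bands):
    """A sample covariance matrix of p variables needs at least p + 1 points
    before it can be inverted."""
    return n_bands + 1

def can_classify(field_pixels, n_bands):
    """True when a field has enough pixels for the per-field classifier."""
    return len(field_pixels) >= min_points_required(n_bands)

# Nine bands were used in the Missouri overlay, so ten points are needed.
required = min_points_required(9)

# Field counts from the totals row of Table 27.
fields_total, fields_unclassified, fields_correct = 143, 58, 47
pct_unclassified = 100.0 * fields_unclassified / fields_total
pct_correct_all = 100.0 * fields_correct / fields_total

print(required, round(pct_unclassified, 1), round(pct_correct_all, 1))
```

A field with only eight pixels, for example, would be left unclassified under this rule when nine bands are in use.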

Table 27 -- Per-field classification matrix based on data from 3
            overflights. 1/

         : No. of : Percent : No. of  :        :      :          :       :       : Not
Group    : fields : fields  : samples : Cotton : Corn : Soybeans : Grass : Misc. : classified
         :        : correct :         :        :      :          :       :       :
---------:--------:---------:---------:--------:------:----------:-------:-------:-----------
Cotton   :    38  :   63.2  :    927  :   24   :   0  :     2    :   0   :   1   :    11
Corn     :     7  :   14.3  :     58  :    0   :   1  :     0    :   1   :   1   :     4
Soybeans :    58  :   25.9  :    852  :    9   :   3  :    15    :   3   :   8   :    20
Grass    :    31  :    9.7  :    240  :    3   :   1  :     1    :   3   :   2   :    21
Misc.    :     9  :   44.4  :    140  :    1   :   0  :     1    :   1   :   4   :     2
Totals   :   143  :   32.9  :   2217  :   37   :   5  :    19    :   8   :  16   :    58

1/ August 26, 1972, MSS bands 4, 5, 7
   September 14, 1972, MSS bands 5, 7
   October 2, 1972, MSS bands 4, 5, 6, 7






Temporal Overlay 

The next analysis investigated the value of a temporal overlay of the
three LANDSAT passes. This particular data set was a temporal overlay
of three LANDSAT passes. Each pass could also be compared with the three
passes combined. However, there were 3 bad bands in the total of 12. Two
poor quality bands were in the September 14 imagery and one poor quality
band was in the August 26 imagery. This makes it difficult to compare the
three dates since the number of bands was confounded with dates.
Nevertheless, the C.M.'s for each date are presented in Tables 28, 29,
and 30. These tables can be compared to the 9-band overlay of Table 24
since they are all unequal prior probability models.


Table 28 -- Classification matrix using August 26, 1972, MSS bands 4, 5,
            and 7 with unequal prior probabilities.

         : No. of :         :        Number of samples classified into
Group    : sample : Percent :--------------------------------------------------
         : points : correct : Cotton : Corn : Soybean : Grass : Miscellaneous
---------:--------:---------:--------:------:---------:-------:---------------
Cotton   :   927  :   60.6  :   562  :   1  :   311   :   22  :      31
Corn     :    58  :   10.3  :    12  :   6  :    30   :    2  :       8
Soybean  :   852  :   86.0  :    70  :   2  :   733   :   29  :      18
Grass    :   240  :    8.3  :    42  :   7  :   167   :   20  :       3
Misc.    :   140  :   31.4  :     9  :   3  :    76   :    8  :      44
Totals   :  2217  :         :   696  :  19  :  1317   :   81  :     104

Overall performance 61.5 percent


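Table 29 -- Classification matrix using September 13, 1972, MSS bands 5
            and 7 with unequal prior probabilities.

         : No. of :         :        Number of samples classified into
Group    : sample : Percent :--------------------------------------------------
         : points : correct : Cotton : Corn : Soybean : Grass : Miscellaneous
---------:--------:---------:--------:------:---------:-------:---------------
Cotton   :   927  :   69.7  :   646  :   0  :   246   :       :
Corn     :    58  :    0.0  :    12  :   0  :    16   :       :
Soybean  :   852  :   67.6  :   175  :   1  :   576   :       :
Grass    :   240  :   42.1  :    40  :   0  :    97   :       :
Misc.    :   140  :   22.8  :    14  :   2  :    82   :       :
Totals   :  2217  :         :   887  :   3  :  1017   :       :

(Entries for the Grass and Miscellaneous columns are not legible in the
source copy.)

Overall performance 61.0 percent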



Table 30 -- Classification matrix using October 2, 1972, MSS bands 4, 5, 6,
            and 7 with unequal prior probabilities.

         : No. of :         :        Number of samples classified into
Group    : sample : Percent :--------------------------------------------------
         : points : correct : Cotton : Corn : Soybean : Grass : Miscellaneous
---------:--------:---------:--------:------:---------:-------:---------------
Cotton   :   927  :   73.2  :   679  :   6  :   161   :   59  :      22
Corn     :    58  :   12.1  :    30  :   7  :    14   :    1  :       6
Soybean  :   852  :   62.4  :   200  :   7  :   532   :   76  :      37
Grass    :   240  :   27.9  :    83  :   0  :    89   :   67  :       1
Misc.    :   140  :   17.9  :    30  :   1  :    73   :   11  :      25
Totals   :  2217  :         :  1022  :  21  :   869   :  214  :      91

Overall performance 59.1 percent






Table 31 summarizes these three classification matrices in one table.

Table 31 -- Comparison of multitemporal classification performance to
            classification of single dates. 1/ Missouri Study Area.

                    Percent correct
Group     : Multitemporal : Aug. 26 : Sept. 14 : Oct. 2
----------:---------------:---------:----------:--------
Cotton    :     79.7      :   60.6  :   69.7   :  73.2
Corn      :     44.8      :   10.3  :    0.0   :  12.1
Soybeans  :     71.8      :   86.0  :   67.6   :  62.4
Grass     :     53.3      :    8.3  :   42.1   :  27.9
Misc.     :     89.3      :   31.4  :   22.8   :  17.9
Overall   :     70.8      :   61.6  :   61.1   :  59.2

1/ Unequal prior probabilities were used for all classifications.






The same classifications were run for all dates individually except that 
equal prior probabilities were used. 

Table 32 -- Classification matrix for August 26, 1972, based on MSS bands
            4, 5, and 7 using equal prior probabilities.

         : No. of :         :        Number of samples classified into
Group    : sample : Percent :--------------------------------------------------
         : points : correct : Cotton : Corn : Soybean : Grass : Miscellaneous
---------:--------:---------:--------:------:---------:-------:---------------
Cotton   :   927  :   60.7  :   563  :  92  :   108   :   63  :     101
Corn     :    58  :   56.9  :     2  :  33  :     0   :    7  :      16
Soybean  :   852  :   15.3  :    57  :  72  :   130   :  245  :     348
Grass    :   240  :   45.4  :    32  :  41  :    26   :  109  :      32
Misc.    :   140  :   62.9  :    11  :  10  :    13   :   18  :      88
Totals   :  2217  :         :   665  : 248  :   277   :  442  :     585

Overall performance 41.6 percent






Table 33 -- Classification matrix for September 13, 1972, based on MSS
            bands 5 and 7 using equal prior probabilities.

         : No. of :         :        Number of samples classified into
Group    : sample : Percent :--------------------------------------------------
         : points : correct : Cotton : Corn : Soybean : Grass : Miscellaneous
---------:--------:---------:--------:------:---------:-------:---------------
Cotton   :   927  :   60.7  :   563  :  92  :   108   :   63  :     101
Corn     :    58  :   56.9  :     2  :  33  :     0   :    7  :      16
Soybean  :   852  :   15.3  :    57  :  72  :   130   :  245  :     348
Grass    :   240  :   45.4  :    32  :  41  :    26   :  109  :      32
Misc.    :   140  :   62.9  :    11  :  10  :    13   :   18  :      88
Totals   :  2217  :         :   665  : 248  :   277   :  422  :     585

Overall performance 50.8 percent








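Table 34 -- Classification matrix for October 2, 1972, based on MSS bands
            4, 5, 6, and 7 using equal prior probabilities.

         : No. of :         :        Number of samples classified into
Group    : sample : Percent :--------------------------------------------------
         : points : correct : Cotton : Corn : Soybean : Grass : Miscellaneous
---------:--------:---------:--------:------:---------:-------:---------------
Cotton   :   927  :   66.7  :   618  :  35  :    30   :  149  :      95
Corn     :    58  :   37.9  :    21  :  22  :     4   :    4  :       7
Soybean  :   852  :   20.8  :   142  :  46  :   177   :  141  :     346
Grass    :   240  :   42.5  :    58  :   9  :    23   :  102  :      48
Misc.    :   140  :   60.7  :    20  :   8  :     8   :       :      85
Totals   :  2217  :         :   860  : 120  :   242   :  414  :     581

(The Grass entry in the Misc. row is not legible in the source copy.)

Overall performance 45.3 percent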


Table 35 summarizes these tables. 


Table 35 -- Comparison of multitemporal classification performance to
            classifications of single dates using equal prior
            probabilities. 1/ Missouri Study Area.

                    Percent correct
Group     : Multitemporal : Aug. 26 : Sept. 13 : Oct. 2
----------:---------------:---------:----------:--------
Cotton    :     74.3      :   60.7  :   71.4   :  66.2
Corn      :     58.6      :   56.9  :   34.5   :  37.9
Soybeans  :     39.7      :   15.3  :   28.9   :  20.8
Grass     :     57.1      :   45.4  :   44.6   :  42.5
Misc.     :     75.0      :   62.9  :   65.7   :  60.7
Overall   :     58.8      :   41.6  :   50.8   :  45.3



The temporal overlay classification of Table 25 shows an overall perfor- 
mance of 58.8 percent as compared to 41.6 percent, 50.8 percent, and 45.3 
percent, respectively, for Tables 32, 33, 34. Based on these comparisons, 
the temporal overlay does improve the classification. However, the eval- 
uation can become more difficult to interpret in the temporal overlay 
tapes because of changes in land use from one date to the next. Thus, 
the time of year becomes very important in areas where double-cropping is 
common or preparation of land follows each crop. It should be pointed 
out that these dates were not optimal. Other dates would have given dif- 
ferent results. 

Independent Test Data 

The last exercise was completed to estimate the C.M. in Missouri on
independent data. Since the number of fields and points within them is
small and the area covered is large, we need more training data to
represent the total area. It did not seem possible to divide the set
into halves and still have enough training data. It was decided to use a
jackknife procedure. This procedure has the advantage of giving unbiased
estimates that are simple to calculate. The data were divided into three
equal subgroups; two groups were used for training and the third group
was used as a test group. This was repeated three times, each time with
a different group used as test data. These three tables are presented
separately, then the three are combined and presented to give an unbiased
estimate of the classification matrix where independent test data are
used. By using independent data, it is hoped that the bias caused by
using the same data for both training and testing would be eliminated,
but the variance of each item in the latter tables may be somewhat higher
than those in the previous tables since a smaller data set was used.

One cotton field of 27 points was not included in any of the three
groups, so the total in Table 39 is 27 pixels smaller than the total of
earlier tables. Table 39 is the matrix sum of Tables 36, 37, and 38.
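The hold-out scheme described above (what would now usually be called three-fold cross-validation) and the matrix sum can be sketched as follows. This is an illustration only: the way fields are dealt into subgroups and the field identifiers are hypothetical, since the report does not say how the subgroups were formed. The three cotton rows are the ones printed in Tables 36, 37, and 38.

```python
def split_into_folds(items, k=3):
    """Deal items into k roughly equal subgroups."""
    return [items[i::k] for i in range(k)]

def hold_out_rounds(folds):
    """Yield (training, test) pairs, holding out each subgroup once."""
    for i, test in enumerate(folds):
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield training, test

def add_matrices(matrices):
    """Element-wise sum of equally shaped classification matrices."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[r][c] for m in matrices) for c in range(cols)]
            for r in range(rows)]

# Hypothetical field identifiers, split into three subgroups.
field_ids = list(range(9))
rounds = list(hold_out_rounds(split_into_folds(field_ids)))

# Cotton rows (cotton, corn, soybean, grass, misc.) from Tables 36, 37, 38.
cotton_rows = [
    [[269, 11, 129, 36, 34]],
    [[167, 36, 11, 19, 57]],
    [[62, 22, 1, 22, 24]],
]
combined_cotton = add_matrices(cotton_rows)
print(len(rounds), combined_cotton)
```

Each round trains on two subgroups and tests on the third, so every field is used as test data exactly once before the three matrices are summed.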


Table 36 -- Classification matrix using August 26, 1972, MSS bands 4, 5,
            and 7 with subgroups 2 and 3 as training data and subgroup 1
            as test data.

         : No. of :         :        Number of samples classified into
Group    : sample : Percent :--------------------------------------------------
         : points : correct : Cotton : Corn : Soybean : Grass : Miscellaneous
---------:--------:---------:--------:------:---------:-------:---------------
Cotton   :   479  :   56.2  :   269  :  11  :   129   :   36  :      34
Soybean  :   138  :   45.7  :    35  :   6  :    63   :   17  :      17
Grass    :    66  :   34.8  :    15  :   7  :    15   :   23  :       6
Misc.    :    68  :   16.2  :     1  :   4  :    39   :   13  :      11
Totals   :   751  :         :   320  :  28  :   246   :   89  :      68

Overall performance 48.7 percent




Table 37 -- Classification matrix using August 26, 1972, MSS bands 4, 5,
            and 7 with subgroups 1 and 3 as training data and subgroup 2
            as test data.

         : No. of :         :        Number of samples classified into
Group    : sample : Percent :--------------------------------------------------
         : points : correct : Cotton : Corn : Soybean : Grass : Miscellaneous
---------:--------:---------:--------:------:---------:-------:---------------
Cotton   :   290  :   57.6  :   167  :  36  :    11   :   19  :      57
Corn     :    29  :   13.8  :     1  :   4  :     0   :    8  :      16
Soybean  :   308  :   13.0  :    48  :  53  :    40   :   20  :     147
Grass    :    42  :   28.6  :     1  :  11  :     4   :   12  :      14
Misc.    :    57  :   78.9  :     0  :   2  :     8   :    2  :      45
Totals   :   726  :         :   217  : 106  :    64   :   63  :     279

Overall performance 36.9 percent






Table 38 -- Classification matrix using August 26, 1972, MSS bands 4, 5,
            and 7 with subgroups 1 and 2 as training data and subgroup 3
            as test data.

         : No. of :         :        Number of samples classified into
Group    : sample : Percent :--------------------------------------------------
         : points : correct : Cotton : Corn : Soybean : Grass : Miscellaneous
---------:--------:---------:--------:------:---------:-------:---------------
Cotton   :   131  :   47.3  :    62  :  22  :     1   :   22  :      24
Corn     :    29  :   41.4  :     3  :  12  :     2   :    5  :       7
Soybean  :   406  :    2.0  :     6  :  29  :     8   :  137  :     226
Grass    :   132  :   43.2  :    20  :  27  :     0   :   57  :      28
Misc.    :    15  :    0.0  :     5  :   2  :     0   :    8  :       0
Totals   :   713  :         :    96  :  92  :    11   :  229  :     285

Overall performance 19.5 percent




Table 39 -- Classification matrix combining Tables 36, 37, and 38.

         : No. of :         :        Number of samples classified into
Group    : sample : Percent :--------------------------------------------------
         : points : correct : Cotton : Corn : Soybean : Grass : Miscellaneous
---------:--------:---------:--------:------:---------:-------:---------------
Cotton   :   900  :   55.3  :   498  :  69  :   141   :   77  :     115
Corn     :    58  :   27.6  :     4  :  16  :     2   :   13  :      23
Soybean  :   852  :   13.0  :    89  :  88  :   111   :  174  :     390
Grass    :   240  :   28.3  :    36  :  45  :    19   :   92  :      48
Misc.    :   140  :   40.0  :     6  :   8  :    47   :   23  :      56
Totals   :  2190  :         :   633  : 226  :   320   :  379  :     632

Overall performance 34.6 percent


The comparable classification where non-independent data were used is
shown in Table 40.

Table 40 -- Classification matrix using August 26, 1972, MSS bands 4, 5,
            and 7.

         : No. of :         :        Number of samples classified into
Group    : sample : Percent :--------------------------------------------------
         : points : correct : Cotton : Corn : Soybean : Grass : Miscellaneous
---------:--------:---------:--------:------:---------:-------:---------------
Cotton   :   927  :   60.7  :   563  :  92  :   108   :   63  :     101
Corn     :    58  :   56.9  :     2  :  33  :     0   :    7  :      16
Soybean  :   852  :   15.3  :    57  :  72  :   130   :  245  :     348
Grass    :   240  :   45.4  :    32  :  41  :    26   :  109  :      32
Misc.    :   140  :   93.6  :    11  :  10  :    13   :   18  :     131
Totals   :  2217  :         :   665  : 248  :   277   :  442  :     585

Overall performance 43.6 percent



Anytime the results differ this much between data sets, we know that
either the data set is too small or the bias is large. Obviously, we have
not reached the point where we have convergence of parameters based on
independent and non-independent data sets. The sample size necessary
depends on the variation in the data set, and the variation in the data
set is generally a function of how dispersed the data really are. One
thing is certain: with a small data set, either procedure may lead to
erroneous conclusions.

Kansas : 

The LANDSAT analysis was done on the CRD in the southwest corner of the 
State. Figure 2 shows the State of Kansas with the study area outlined. 

Analysis of Kansas LANDSAT Data 

The objectives of the analysis of Kansas LANDSAT data were the following:

1. Test the covariance matrices of the most important crops to see 
if they were equal. 

2. Compute the classification rates for the Kansas test site. 

3. Compute the correlation coefficients between ground observation 
acreage and classified pixels. 

4. Study the effect of classification in one LANDSAT frame using 
training parameters from an adjoining pass taken one day apart. 

5. Study the classification of a Kansas county. 

Approach : 

1. LANDSAT imagery for the study area was too cloudy to be useful, prior 
to September 21, 1972. The study was based on September 21 and 22 
imagery. The area of interest in Kansas was divided by two LANDSAT 
passes, thus the training data was also divided. Twenty-two segments 
were in the September 21 imagery. Seven of these segments were hid- 
den by clouds. Therefore, 15 segments were used as training and test 
data. 

Since the time of year was not conducive to optimal results, a visual 
inspection of the grey-scale printout of MSS band 5 and ground truth 
was used to select particular fields to use as training fields; i.e. 
those fields which were partially harvested and those with a confusion 
of symbols were discarded. Another reason for selecting fields was 
to compare parameters from one pass with those from another as described 
in this report. 




As a first step, the covariance matrices of the most important 
crops were compared and tested within frames and between frames. 
Tables 41 and 42 show the pertinent data. 

The test criterion was computed and indicates that the within- 
crop covariances are statistically different. Also, the covariances 
between frames for the same crops were tested and are significantly 
different. 

This would indicate that quadratic discriminant analysis could 
produce better results. In addition, a method of signature exten- 
sion would be complicated if one wished to go from one frame to 
another. 

2. The next step was to employ the quadratic classifier for the 

training data. The classification based on these select fields 
is presented in Table 43. 

The overall performance was 91.2%. The classification used the 
standard pointwise quadratic discriminant functions found in 
LARSYS with the added feature of allowing unequal prior probabili- 
ties for the different crops. The unequal prior probabilities 
use information that is available about the likelihood of certain 
crops. If, for example, corn is more likely to be encountered 
than grain sorghum, corn is given a higher chance of occurrence. 

In most classifications using unequal prior probabilities done 
in Kansas, the prior probabilities were: 

1) Alfalfa - .03 

2) Pasture - .72 

3) Corn - .09 

4) Grain Sorghum - .16 

Prior probabilities in this report were computed from a probability 
survey conducted by the Statistical Reporting Service in June 1972, 

(June Enumerative Survey). 
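Turning survey acreage into prior probabilities amounts to dividing each crop's acreage by the total. The sketch below is only an illustration: the acreage figures are hypothetical values chosen so that the resulting priors match the four listed above, and the function name is an assumption, not the procedure used for the June Enumerative Survey.

```python
def priors_from_acreage(acreage):
    """Convert acreage per crop into prior probabilities that sum to 1."""
    total = sum(acreage.values())
    return {crop: acres / total for crop, acres in acreage.items()}

# Hypothetical acreage totals (not from the report).
survey_acres = {
    "alfalfa": 30.0,
    "pasture": 720.0,
    "corn": 90.0,
    "grain sorghum": 160.0,
}

priors = priors_from_acreage(survey_acres)
for crop, prior in priors.items():
    print(f"{crop:14s} {prior:.2f}")
```

If the pixels presented for classification come from a different mix of crops than the survey area, the priors computed this way will not match the data being classified.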

In Table 43, the number of pixels to be classified is not proportional
to the prior probabilities. The prior probabilities are based on acreage
of all segments in the Crop Reporting District, and not the segments in
frame 1060-16512. Development of proper prior probabilities for areas
divided by LANDSAT passes presents additional problems. A better
correspondence would have resulted in higher overall classification;
however, 91.2% is very good.


Table 41 -- Covariance matrices and mean vectors for frame 1060-16512
            (September 21, 1972).

Crop                       :  Mean  :  Covariance (lower triangle)
---------------------------:--------:----------------------------------------
Alfalfa (n = 43)           : 26.63  :    3.430
                           : 19.58  :    4.531     8.535
                           : 50.81  :   -2.357    -8.199    27.346
                           : 30.28  :   -2.751    -7.357    16.363    12.301

Pasture (n = 6378)         : 29.70  :   10.926
                           : 26.36  :   12.975    21.821
                           : 56.88  :   10.351    12.698    22.487
                           : 20.07  :    4.405     4.332    11.388     7.339

Corn (n = 332)             : 31.63  :   46.883
                           : 29.71  :   77.701   133.003
                           : 43.03  :   26.525    42.905    33.798
                           : 24.84  :    2.728    -6.399    11.275    10.978

Grain Sorghum (n = 508)    : 32.21  :  115.096
                           : 27.32  :  130.402   154.965
                           : 43.78  :   78.251    85.757    76.431
                           : 25.65  :   18.089    16.152    29.548    18.198


Table 42 -- Covariance matrices and mean vectors for frame 1061-16570
            (September 22, 1972).

Crop                       :  Mean  :  Covariance (lower triangle)
---------------------------:--------:----------------------------------------
Alfalfa (n = 78)           : 24.23  :    8.180
                           : 15.96  :   12.793    24.701
                           : 55.61  :  -18.345   -36.494    71.234
                           : 34.51  :  -15.063   -29.604    50.802    39.313

Pasture (n = 320)          : 28.62  :    5.290
                           : 25.53  :    6.109    11.002
                           : 35.98  :    3.534     3.061    19.272
                           : 19.81  :    1.056     0        11.213     8.237

Corn (n = 337)             : 24.52  :    1.877
                           : 19.91  :    2.183     9.120
                           : 36.88  :    0.339    -5.114    17.056
                           : 22.82  :   -0.081    -5.291    11.039     8.820

Grain Sorghum (n = 177)    : 27.16  :   32.718
                           : 22.76  :   49.217    77.088
                           : 43.69  :    2.100     2.865    16.646
                           : 27.09  :  -15.639   -24.393    10.975    19.448


Table 43 -- Classification matrix for September 21, 1972 MSS bands 4, 5,
            and 7, using quadratic discriminant functions with unequal
            prior probabilities in Kansas test site for select fields.

              : No. of :         :       Number of samples classified into
Class         : sample : Percent :------------------------------------------------
              : points : correct : Alfalfa : Pasture : Corn : Sorghum : Threshold
--------------:--------:---------:---------:---------:------:---------:-----------
Alfalfa       :    43  :  100.0  :    43   :     0   :   0  :     0   :     0
Pasture       :   172  :   98.3  :     0   :   169   :   2  :     1   :     0
Corn          :    51  :   90.2  :     0   :     1   :  46  :     4   :     0
Grain Sorghum :    78  :   69.2  :     0   :    10   :  14  :    54   :     0
Totals        :   344  :         :    43   :   180   :  62  :    59   :     0

Overall performance 91.2%


A classification was then done using all identifiable fields in the 15 
segments. The results of this classification are presented in Table 44. 
The overall performance was 90.2%. 

There was a small decrease in overall performance between Table 43 and 
Table 44. However, a random sample of ground truth yields a better 
representation of all land and allows statistical inferences about the 
pixels. 

The second pass required to cover the Kansas test site was analyzed in
the same way as described above. The second scene contained 23 segments,
but one of these segments fell in a non-agricultural area. In addition
to the random segments, two additional segments were selected which
contained sugar beets.

Table 45 presents the classification of select fields for the second pass. 
The fields were selected from the grey-scale printout as described above. 
The overall performance was 75.5%. 


Table 44 -- Classification matrix for September 21, 1972 imagery (MSS bands
            4, 5, 6, and 7), using quadratic discriminant functions with
            unequal prior probabilities in Kansas test site.

              : No. of :         :       Number of samples classified into
Class         : sample : Percent :------------------------------------------------
              : points : correct : Alfalfa : Pasture : Corn : Sorghum : Threshold
--------------:--------:---------:---------:---------:------:---------:-----------
Alfalfa       :    43  :   93.0  :    40   :     2   :   0  :     1   :     0
Pasture       :  6378  :   95.0  :    23   :  6061   : 123  :   142   :    29
Corn          :   332  :   37.7  :    38   :   110   : 125  :    59   :     0
Grain Sorghum :   508  :   64.8  :    38   :    77   :  60  :   329   :     4
Totals        :  7261  :         :   139   :  6250   : 308  :   531   :    33

Overall performance 90.2%


Table 45 -- Classification matrix for September 22, 1972 imagery (MSS bands
            4, 5, 6, and 7), using quadratic discriminant functions with
            unequal prior probabilities in Kansas test site for select
            fields.

              : No. of :         :       Number of samples classified into
Class         : sample : Percent :------------------------------------------------
              : points : correct : Alfalfa : Pasture : Corn : Grain   : Threshold
              :        :         :         :         :      : Sorghum :
--------------:--------:---------:---------:---------:------:---------:-----------
Alfalfa       :    78  :   84.6  :    66   :    12   :   0  :     0   :     0
Pasture       :   230  :   93.0  :     0   :   214   :  11  :     5   :     0
Corn          :   337  :   65.0  :     0   :    93   : 219  :    25   :     0
Grain Sorghum :   177  :   63.9  :     3   :    34   :  18  :   122   :     0
Totals        :   822  :         :    69   :   353   : 248  :   152   :     0

Overall performance 75.5%




Table 46 represents a classification of the second scene, using all 
identifiable fields. The overall performance was 65.8%. This 
decrease in performance could be attributed to several things. The 
number of crops being classified was increased from four to seven. 
Increasing the number of crops will reduce the performance. Secondly, 
there was a confusion between most crops and pasture. This could have 
resulted from using late September imagery; all crops are spectrally 
similar. Thirdly, the frequency of the data pixels presented for 
classification differed drastically from the prior probabilities used. 

Table 47 is a classification study using the same select training 
fields that were used in Table 45. However, in Table 47 equal prior 
probabilities were applied. In Table 47, the overall performance 
at 79.2% is actually better than the 75.5% in Table 45. Applying 
prior probabilities based on all fields to a non-random selection 
of fields in a particular area is the cause for the lower classifica- 
tion in Table 45. 

Table 48 presents a classification of all identifiable fields in scene 
1061-16570, using equal prior probabilities. This table is comparable 
with the weighted classification presented in Table 46. The overall 
performance was increased 4.4% by using prior probabilities. When all 
fields are used in the classification, the total acres per crop more 
closely estimate the true prior probabilities of the model. 

The increase caused by using unequal prior probabilities in Kansas
was not as great as it had been in other areas. The smaller gain
from prior probabilities is perhaps caused by the fact that the
LANDSAT data contained more information; i.e., the classes were more
separable. Thus, the expected gain from prior probabilities is
greater in areas where classification is poorer.

3. The correlations between acres and pixels were calculated. Coordi- 
nates of ground truth segments were carefully defined. The training 
data from each scene were used to classify the segments in that scene. 
The classified pixels in the two scenes were then combined (i.e., 
Tables 44 and 46 were combined) and correlations with known ground 
truth acreage were computed. 

Correlations between acreage and pixels were calculated as follows:

Total Acreage vs Total Pixel                 r^2 = .88    r = .94
Pasture Acreage vs Pasture Pixel             r^2 = .84    r = .92
Corn Acreage vs Corn Pixel                   r^2 = .62    r = .79
Grain Sorghum vs Grain Sorghum Pixel         r^2 = .58    r = .76
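A correlation of this kind can be computed per segment from ground-truth acres and classified pixel counts. The sketch below is illustrative: the `pearson_r` helper and the segment values are hypothetical, and only the r^2 figures are taken from the list above, to confirm that each listed r is the square root of its r^2.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-segment values: ground-truth acres and classified pixels.
acres = [120.0, 300.0, 80.0, 410.0, 250.0]
pixels = [105, 290, 95, 380, 270]
r_example = pearson_r(acres, pixels)

# r^2 values from the list above; r is the positive square root.
r_squared = {"total": 0.88, "pasture": 0.84, "corn": 0.62, "grain sorghum": 0.58}
r_values = {name: round(math.sqrt(v), 2) for name, v in r_squared.items()}
print(round(r_example, 3), r_values)
```

The square of the correlation is the share of the variation in acreage accounted for by the pixel counts, which is why both figures are reported.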




Table 46 -- Classification matrix for September 22, 1972 imagery (MSS bands
            4, 5, 6, and 7), using unequal prior probabilities, Kansas,
            all fields.

[Table body not legible in the source copy.]


Table 47 -- Classification matrix for September 22, 1972 imagery, MSS 
bands 4, 5, 6, and 7, using quadratic discriminant functions 
with equal prior probabilities in Kansas test site for select 
fields. 

                 No. of               Number of samples classified into 
                 sample   Percent                                Grain 
Class            points   correct   Alfalfa  Pasture    Corn   Sorghum  Threshold 

Alfalfa             78     84.6        66       11        0        1        0 
Pasture            230     75.2         3      173       38       16        0 
Corn               337     87.5         0       29      295       13        0 
Grain Sorghum      177     66.1        14       16       30      117        0 

Totals             822                 83      229      363      147        0 

Overall performance 79.2% 


When pixels and acreage are this highly correlated, remotely sensed 
data is beneficial. 

4. In this study, the statistics compiled on one LANDSAT frame were used 
to classify points in the adjacent frame. As described earlier, two 
adjacent passes were used to obtain necessary coverage of Kansas. 

The select fields from both scenes (as described in Section A) had 
four classes (alfalfa, pasture, corn, and grain sorghum). These four 
classes were also the classes for the "all fields" in frame 1060-16512. 
One requirement is that the same classes be used for training as those 
classified. The classification used the quadratic discriminant 
function with unequal prior probabilities. 

Table 49 presents the results of classifying the select fields in 
frame 1060-16512, using training statistics generated from select 
fields in frame 1061-16570. The overall performance was 54.4%; however, 
the average performance by classes 1/ was 33.3% correct classification. 
The 100% correct classification of the pasture class 
greatly influenced the overall classification. 


1/ The average performance by classes is computed by averaging the percent 
identified for each class. 
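The difference between the two summary figures can be sketched as follows. This is a minimal illustration, not the program used in the study; the counts are those read from Table 51, chosen because both of its summary figures (49.0% overall, 59.1% average by classes) can be reproduced from the matrix. The function names are illustrative.

```python
def overall_performance(matrix):
    """Percent of all sample points that fall on the diagonal."""
    correct = sum(row[i] for i, row in enumerate(matrix))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total

def average_class_performance(matrix):
    """Unweighted mean of the per-class percent correct."""
    per_class = [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]
    return sum(per_class) / len(per_class)

# Rows: true class. Columns: alfalfa, pasture, corn, grain sorghum, threshold.
# Counts as read from Table 51.
table_51 = [
    [63, 12, 0, 0, 3],      # alfalfa
    [0, 217, 4, 8, 1],      # pasture
    [5, 140, 31, 161, 0],   # corn
    [12, 30, 43, 92, 0],    # grain sorghum
]

overall = overall_performance(table_51)
average = average_class_performance(table_51)
```

The overall figure weights each class by its number of sample points, so a large, well-classified class such as pasture can dominate it; the average by classes gives every class equal weight.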






Table 49 -- Classification matrix of select fields in frame 1060-16512 
classification, using statistics from select fields in frame 
1061-16570. 

                 No. of               Number of samples classified into 
                 sample   Percent                                Grain 
Class            points   correct   Alfalfa  Pasture    Corn   Sorghum  Threshold 

Alfalfa             43      0.0         0       41        0        1        1 
Pasture            172    100.0         0      172        0        0        0 
Corn                51      0.0         3        7        0       41        0 
Grain Sorghum       78     33.3         7       28       15       26        2 

Totals             344                 10      248       15       68        3 

Overall performance 54.4% 







Table 50 is a classification of all identifiable fields in the segments 
in frame 1060-16512, using the statistics generated from the 
select fields in frame 1061-16570. The classifications, with an 
overall performance of 85.5% and an average class performance of 48.5%, 
are very good. Here again, it was the correctly classified pasture 
points which kept the averages high. In Table 50, more fields were 
classified and the influence of prior probabilities was more beneficial 
than in the cases where select fields were classified. 


Table 51 shows a classification of select fields in frame 1061-16570, 
using statistics generated from all fields in frame 1060-16512. In 
this study the overall performance slipped to 49.0% but the average 
class performance was 59.1%. Classification was very good in all 
classes except corn, which was confused with pasture and grain sor- 
ghum. The time of year may have caused this confusion. 

5. The border of Stevens County, Kansas was drawn on a grey-scale map of 
MSS band 5. The area was then defined on punch cards and classified. 
Training data for the classification were obtained from segments in 
the Crop Reporting District which contains Stevens County. Three of 
these segments were actually in Stevens County. A total of 410,505 
pixels were classified, corresponding to a calculated 466,560 acres 
in the county. 




Table 50 -- Classification matrix of all fields in frame 1060-16512 
classification, using statistics generated from "select fields" in 
frame 1061-16570. 

                 No. of               Number of samples classified into 
                 sample   Percent                                Grain 
Class            points   correct   Alfalfa  Pasture    Corn   Sorghum  Threshold 

Alfalfa             43     65.1        28        3        0       12        0 
Pasture           6378     93.2         7     5943       11      277      140 
Corn               332      7.5         8       79       25      204       16 
Grain Sorghum      508     28.3        16      105       75      144      168 

Totals            7261                 59     6130      111      637      324 

Overall performance 85.5% 


Table 51 -- Classification matrix of select fields in frame 1061-16570 
classification, using statistics generated from "all fields" 
in frame 1060-16512. 

                 No. of               Number of samples classified into 
                 sample   Percent                                Grain 
Class            points   correct   Alfalfa  Pasture    Corn   Sorghum  Threshold 

Alfalfa             78     80.8        63       12        0        0        3 
Pasture            230     94.3         0      217        4        8        1 
Corn               337      9.2         5      140       31      161        0 
Grain Sorghum      177     52.0        12       30       43       92        0 

Totals             822                 80      399       78      261        4 

Overall performance 49.0% 






Alfalfa, pasture, corn, and grain sorghum were the crops classified. The 
following classification was obtained: 

             Number of                                    Grain 
             Pixels      Alfalfa    Pasture     Corn     Sorghum   Threshold 

Pixels       410,505      5,362     172,021    30,448    165,107     37,567 
Percent                    1.3%      41.9%      7.4%      40.2%       9.2% 


The prior probabilities applied, expressed as percentages, were the 
following: 

    Alfalfa           3% 
    Pasture          72% 
    Corn              9% 
    Grain Sorghum    16% 

There is confusion between pasture and grain sorghum. Ways to use this 
data to produce a final estimate will be discussed in the section on 
estimation. 
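The arithmetic behind the Stevens County figures can be sketched as follows. The pixel counts, county acreage, and prior percentages are those given in the text; the per-class acreage computed here is only a raw pixel-count conversion shown for illustration, not the estimator developed in the section on estimation, and the variable names are illustrative.

```python
# Figures as given in the text for Stevens County, Kansas.
TOTAL_ACRES = 466_560
pixel_counts = {
    "alfalfa": 5_362,
    "pasture": 172_021,
    "corn": 30_448,
    "grain sorghum": 165_107,
    "threshold": 37_567,
}
priors_pct = {"alfalfa": 3, "pasture": 72, "corn": 9, "grain sorghum": 16}

total_pixels = sum(pixel_counts.values())
acres_per_pixel = TOTAL_ACRES / total_pixels  # roughly 1.14 acres per pixel

# Share of classified pixels per class, in percent.
classified_pct = {
    name: round(100.0 * n / total_pixels, 1) for name, n in pixel_counts.items()
}

# Raw pixel-count conversion to acres (illustration only).
raw_acres = {name: n * acres_per_pixel for name, n in pixel_counts.items()}

# Difference between the classified share and the prior, in percentage points.
gap = {name: classified_pct[name] - priors_pct[name] for name in priors_pct}
```

The gap for pasture is strongly negative and the gap for grain sorghum strongly positive, which is one way to see the pasture and grain sorghum confusion noted above.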

South Dakota 

The test site in South Dakota is in the eastern part of the State. Figure 
1 shows this Crop Reporting District. 

Analysis of LANDSAT Data in South Dakota 

Objectives: 

The objective of this section was to determine the classification accuracy 
in the South Dakota test site. 

Approach: 

Imagery for three dates was available. However, the August and early 
September imagery was too cloudy to be useful. Thus, later September 
imagery was used. All 34 segments were contained in one LANDSAT frame 
(1060-16491). The segments and fields within segments were located and 
defined on punch cards. These segments were used for both training and 
classifying. 

The LARS classifier with unequal prior probabilities was used. The classi- 
fier is a standard discriminant analysis. 
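The role of the prior probabilities in a quadratic discriminant rule can be sketched as follows. This is a simplified two-band illustration, not the LARS classifier itself: the class means and covariances are hypothetical, the threshold class is not modeled, and the priors 0.72 and 0.09 merely echo the pasture and corn priors quoted for Kansas. The score used is the standard Gaussian form ln(prior) - 0.5 ln|S| - 0.5 (x - m)' S^-1 (x - m).

```python
import math

def qda_score(x, mean, cov, prior):
    """Quadratic discriminant score for one class with two spectral bands.

    cov is a symmetric 2x2 covariance matrix given as ((a, b), (b, c)).
    """
    (a, b), (_, c) = cov
    det = a * c - b * b
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    d0, d1 = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form via the closed-form inverse of a 2x2 matrix.
    maha = (c * d0 * d0 - 2.0 * b * d0 * d1 + a * d1 * d1) / det
    return math.log(prior) - 0.5 * math.log(det) - 0.5 * maha

def classify(x, classes):
    """Return the class name with the highest discriminant score."""
    return max(classes, key=lambda name: qda_score(x, *classes[name]))

# Hypothetical two-band statistics per class: (mean, covariance).
stats = {
    "pasture": ((24.0, 34.0), ((5.0, 1.0), (1.0, 30.0))),
    "corn": ((22.0, 31.0), ((5.0, 2.0), (2.0, 33.0))),
}
unequal = {"pasture": stats["pasture"] + (0.72,), "corn": stats["corn"] + (0.09,)}
equal = {name: s + (0.5,) for name, s in stats.items()}

pixel = (22.0, 31.0)  # a reading lying exactly on the hypothetical corn mean
label_unequal = classify(pixel, unequal)
label_equal = classify(pixel, equal)
```

With these overlapping hypothetical classes, the pixel is assigned to corn under equal priors but to pasture under the unequal priors, which mirrors how a large prior can pull spectrally similar crops into the dominant class.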





Table 52 presents a classification of pixels in all segments in South 
Dakota. The overall performance was 30%, but the average class performance 
was 15%. Almost all classes in Table 52 were classified as either pasture 
or oats. 

There were two reasons for this. First, prior probabilities used were 
large for pasture and oats, and second, the spectral data is quite similar 
at this period of time for all crops. 

An attempt to improve the classification results was made by selecting 
fields that looked homogeneous. 

These selected fields were used as training data and then classified. The 
results of this classification are presented in Table 53. The overall 
performance was 26% and the average class performance was 44%. There 
appears to be very little information in the data which would aid in the 
separation of crops. The influence of the prior probabilities again was 
the reason pasture and oats had high correct classification rates. 

There must be reasons for the very poor classification rates. In an 
attempt to determine them, we have studied the means and covariances, 
which are given in Table 54. It appears to be impossible to separate 
these classes with this data. Simply looking at the means and covariances 
does not necessarily show the true multivariate situation, which is 
four-dimensional, but it does give an indication. 
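One rough way to read the means and variances is sketched below. It is a band-by-band check that ignores the covariances, so it is only an indication and not the multivariate analysis itself; the measure (mean difference divided by the root-mean-square standard deviation) is an assumption chosen for illustration, not a statistic used in the study. The corn and pasture values are those read from Table 54, in the table's own band order.

```python
import math

# Means and variances (covariance diagonals) as read from Table 54.
corn = {"mean": [22.34, 17.69, 31.40, 19.38], "var": [4.84, 13.25, 33.40, 18.15]}
pasture = {"mean": [23.94, 19.89, 34.34, 20.85], "var": [5.42, 15.13, 29.59, 13.99]}

def standardized_gaps(a, b):
    """Per band: |difference of means| / root-mean-square standard deviation."""
    return [
        abs(ma - mb) / math.sqrt((va + vb) / 2.0)
        for ma, mb, va, vb in zip(a["mean"], b["mean"], a["var"], b["var"])
    ]

gaps = standardized_gaps(corn, pasture)
largest_gap = max(gaps)
```

In every band the corn and pasture means differ by less than one standard deviation, which is consistent with the heavy overlap seen in the classification matrices.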

Summary 

In South Dakota, late September imagery was used because of cloud cover 
in earlier imagery. Classification results were poor. Examination of 
Table 54 showed very little information in the data for the separation 
of the classes of interest. This late in the season, crops were classi- 
fied as either pasture or oats. 

The use of homogeneous fields selected from gray scale printouts and ground 
truth did not improve classification, and actually reduced the overall 
performance rates. 




Table 52 -- Classification matrix for September 21, 1972 imagery (MSS bands 4, 5, 6, and 7), using unequal 
prior probabilities in South Dakota test site. 

(Table body not legible in this reproduction.) 



Table 53 -- Classification matrix for September 21, 1972 imagery (MSS bands 
4, 5, 6, and 7) using quadratic discriminant functions with 
unequal prior probabilities in South Dakota test site for select 
fields. 

             No. of               Number of samples classified into 
             sample   Percent 
Class        points   correct    Corn  Pasture    Oats  Alfalfa   Sudex  Threshold 

Corn           237      6.8       16     150       54      17       0        0 
Pasture         75     88.0        0      66        7       2       0        0 
Oats            12    100.0        0       0       12       0       0        0 
Alfalfa        110     25.5        1      56       24      28       0        1 
Sudex           36      0.0        0      30        6       0       0        0 

Totals         470                17     302      103      47       0        1 

Overall performance 26.0% 

















Table 54 -- Means and covariance matrices for crops in South Dakota on 
frame 1060-16491, September 21, 1972. 

Crop (Number)          Means      Covariance matrix (lower triangle) 

Corn (1060)            22.34       4.84 
                       17.69       6.73   13.25 
                       31.40       2.67   -0.42   33.40 
                       19.38       0.37   -2.95   25.55   18.15 

Pasture (812)          23.94       5.42 
                       19.89       7.79   15.13 
                       34.34       1.14   -1.48   29.59 
                       20.85      -0.69   -3.78   18.72   13.99 

Oats (243)             23.13       9.92 
                       19.09      16.72   33.29 
                       32.98      10.76   14.40   43.16 
                       17.74       4.38    4.48   25.26   16.73 

Barley (97)            24.52       5.47 
                       21.46       6.25   11.15 
                       30.07       5.93    5.41   25.70 
                       17.51       2.65    1.54   16.87   12.53 

Rye (16)               22.31       3.31 
                       17.63       2.71    5.43 
                       35.06       1.63    3.04    7.40 
                       20.94       1.02    1.83    3.78    2.19 

Alfalfa (303)          23.78       6.81 
                       19.90       9.62   17.56 
                       33.15       3.08    1.94   26.42 
                       20.09       0.46   -1.61   16.19   12.25 

Flax (71)              22.30       5.66 
                       18.25       5.39    8.64 
                       27.63       7.99    6.27   41.73 
                       17.55       4.30    2.59   27.63   19.45 

Sorghum (55)           22.51       2.79 
                       17.25       3.00    6.60 
                       32.15       1.44   -1.97   23.04 
                       20.05       0.42   -2.38   15.76   12.74 

Idle (19)              23.05       9.86 
                       19.00      14.74   26.62 
                       31.58       7.79    5.45   27.88 
                       19.63       0.43   -3.92   14.94 

Winter Fallow (82)     23.41       5.47 
                       19.78       9.58   20.70 
                       32.21      -1.27   -5.75   36.24 
                       19.27      -2.77   -7.65   20.93 


Idaho 

The test site in Idaho covers nearly four counties. The Crop Reporting 
District boundaries were bypassed because they did not include some 
areas of homogeneous types of agriculture that should have been included. 
Figure 4 shows the test site area. 

The results are based on 42 segments in the intensive agriculture stratum 
in one LANDSAT frame. Two additional segments are not on this 
frame. The frame that contains these two segments also contains ten 
segments which are on the first frame. Therefore, it may be possible 
to use this overlapping data to calibrate from one frame to the next, 
or to measure the difference due to frames in the means and variance 
for the overlapped data. A method of using calibration or training data 
in one frame to adjust parameters or to classify on another frame would 
be valuable, since it would increase the value of the segment data. 
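One possible form of such a frame-to-frame adjustment is sketched below, purely as an illustration. It assumes a per-band linear relation between the two frames and matches the mean and standard deviation of the readings over the overlapping segments; neither the linear assumption nor the function names come from the study, and the readings are hypothetical.

```python
import math

def mean_sd(values):
    """Sample mean and standard deviation (n - 1 denominator)."""
    n = len(values)
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / (n - 1)
    return m, math.sqrt(var)

def fit_band_adjustment(frame_a, frame_b):
    """Gain and offset that map frame B readings onto frame A's scale.

    Both inputs are readings for the same overlapping ground area, one band.
    """
    mean_a, sd_a = mean_sd(frame_a)
    mean_b, sd_b = mean_sd(frame_b)
    gain = sd_a / sd_b
    offset = mean_a - gain * mean_b
    return gain, offset

# Hypothetical band readings over the overlapping segments (NOT report data).
frame_b = [20.0, 22.0, 25.0, 30.0, 28.0]
frame_a = [1.1 * v + 2.0 for v in frame_b]  # frame A differs by a gain and offset

gain, offset = fit_band_adjustment(frame_a, frame_b)
adjusted_b = [gain * v + offset for v in frame_b]
```

Training statistics from one frame could then, in principle, be applied to adjusted readings from the other; whether a linear relation holds in practice would need to be checked on the overlapped data.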

A crop may be different over a large area because of variety, soil 
type, weather conditions, and state of maturity rather than technical 
factors associated with acquiring imagery. However, it may be possible 
in some areas to do signature extension and this problem should be 
investigated. 

The data had serious banding problems. The problems seem to be most 
apparent in band 5; therefore, that band was left out of the first 
classification. Table 55 shows this first classification. 

Obviously, the classification is not as good as we expected; however, 
by chance, one would expect only 8% correct classification for 12 crop 
categories. Another possible problem with the classification is that 
some field boundaries sometimes fall on adjacent points and, since the 
pixels are partially overlapping, these border pixels may be causing 
some overlap of the crop categories. The grey-scale printout (Figure 
14) which follows illustrates this problem. 


Figure 14 -- Gray scale printout of a segment showing how fields are defined. 
(Printout not reproduced; the horizontal axis is LANDSAT column number, 
658 to 723.) 


Table 55 -- Classification matrix for the Idaho test site, band 5 omitted. 
(Table body not legible in this reproduction.) 

Overall performance 34.7 percent 


It is obvious that many groups are very similar, and therefore, misclas- 
sification is high. We will try combining several into groups based on 
similarity of the estimated parameters, since these initial results 
indicate a number of crops are not distinct. 

The next classification matrix uses equal prior probabilities and is 
presented in Table 56. The overall classification performance is 21.8%. 
This points out that prior information in terms of probabilities is 
also important in this test area. 

Since the data had serious banding problems, it was thought that perhaps 
this caused the extremely poor classification rates. As a result, NASA 
Goddard was asked to reprocess the image to remove the banding. 

The image was reprocessed at considerable expense to Goddard and the 
classifications were again run. The results are shown in Table 57. 

Table 58 is a result of combining classes after classification. It is 
obvious that going to fewer categories does improve the classification. 
However, in Idaho, where many crops are grown, the imagery must contain 
information that will allow users to separate the various crops. 
Perhaps temporal information would improve the value of the Idaho imagery. 
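Combining classes after classification amounts to summing the rows and columns of the classification matrix that belong to the same group. The following minimal sketch shows the operation on a small hypothetical matrix; the crop-to-group mapping is an assumption for illustration, since the grouping used to collapse Table 57 into Table 58 is not spelled out here.

```python
def collapse(matrix, labels, groups):
    """Sum a square classification matrix according to a label -> group map."""
    group_names = []
    for label in labels:
        if groups[label] not in group_names:
            group_names.append(groups[label])
    index = {g: i for i, g in enumerate(group_names)}
    merged = [[0] * len(group_names) for _ in group_names]
    for i, row_label in enumerate(labels):
        for j, col_label in enumerate(labels):
            merged[index[groups[row_label]]][index[groups[col_label]]] += matrix[i][j]
    return group_names, merged

def percent_correct(matrix):
    """Overall percent of points on the diagonal."""
    total = sum(sum(row) for row in matrix)
    return 100.0 * sum(row[i] for i, row in enumerate(matrix)) / total

# Hypothetical counts and grouping (NOT data from the report).
labels = ["barley", "spring wheat", "potatoes"]
groups = {"barley": "small grains", "spring wheat": "small grains",
          "potatoes": "potatoes"}
matrix = [
    [30, 20, 10],   # barley
    [25, 35, 5],    # spring wheat
    [5, 5, 40],     # potatoes
]

names, merged = collapse(matrix, labels, groups)
before = percent_correct(matrix)
after = percent_correct(merged)
```

Confusion between barley and spring wheat no longer counts as error once both belong to one group, so the percent correct can only stay the same or rise; the cost is that the two crops are no longer estimated separately.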

Results of Classification of Aerial Photography 

Since aerial photography is in image form and computer techniques require 
digital data, it is necessary to convert the photographs to optical 
densities. A detailed explanation of how this is done may be found in 
Appendix D. The aerial photography was scanned by a Photometric Data 
System (now Bolen and Chivens) microdensitometer. This instrument 
records optical densities (or transmissions) of wavelengths of light 
corresponding to given color filters. Each time a filter is changed, 
however, the instrument must be recalibrated. The values recorded range 
from 0.00 to 4.00 in optical density. In brief, the range of values 
is spread between the chosen calibration point and total darkness. 
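The relation between transmitted light and optical density, and the effect of the chosen calibration point, can be sketched as follows. The definition used, D = log10(reference / transmitted), is the standard one for optical density; the intensity values are hypothetical, and nothing here is specific to the microdensitometer used in the study.

```python
import math

def optical_density(transmitted, reference):
    """Optical density of a reading relative to a calibration reference."""
    if transmitted <= 0 or reference <= 0:
        raise ValueError("intensities must be positive")
    return math.log10(reference / transmitted)

# The calibration point itself reads 0.00 by construction.
d_calibration = optical_density(1000.0, 1000.0)

# A reading 10,000 times darker than the reference reads 4.00.
d_dark = optical_density(0.1, 1000.0)

# The same hypothetical film spot measured against two calibration points.
spot = 10.0
d_ref_bright = optical_density(spot, 1000.0)
d_ref_dim = optical_density(spot, 500.0)
shift = d_ref_bright - d_ref_dim  # equals log10(1000 / 500)
```

Changing the calibration point shifts every density reading by a constant, so readings taken against different calibration points are not directly comparable.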

Initially, the procedure for scanning segments was as follows: 

An interval point within the photograph was chosen. This point was 
considered the lightest spot on the exposed portion of film and it was set 
at 0.00 on the microdensitometer scale. South Dakota, Kansas, and 
Idaho photography was scanned and the results brought to light problems 
in this technique. 


Table 56 -- Classification matrix for the Idaho test site, equal prior 
probabilities. (Table body not legible in this reproduction.) 

Overall performance 21.8 percent 


Table 57 -- Classification matrix for the Idaho test site, reprocessed 
imagery, unequal prior probabilities. (Table body not legible in this 
reproduction.) 

Overall performance 40.3 percent 


Table 58 -- Classification matrix of Idaho with unequal prior probability 
groups - Table 57 collapsed into 7 groups. 

                No. of   Percent            Small                              Sugar 
Group           samples  correct   Beans   Grains    Corn  Fallow  Pasture    Beets  Potatoes 

Beans            1362     55.6      757      118       5      99     278      100        5 
Small Grains     1061     26.3      215      279      10      86     423       40        8 
Corn              541      8.5       55       24      46      52     287       69        8 
Fallow            779     37.4       29        7       3     291     443        3        3 
Pasture          2747     73.0      337       59      38     250    1754      284       25 
Sugar Beets       386     56.0       20        6       8       1      90      216       45 
Potatoes          395     21.8       15        0       7       0     207       80       86 

Totals           7271              1428      493     117     779    3482      792      180 

Overall performance 47.2 percent 







It was observed that each segment had a different calibration point 
(lightest spot); hence, there were variations in the scanning results. 

As the calibration point changed, grey level readings for the same crop 
in a variety of segments were different. In fact, when the same segment 
was scanned twice using two different calibration (light) spots, the 
crop signatures might not appear similar. 

To overcome this defect, a new calibration technique was developed. 
Emphasis was placed on choosing calibration points which would produce 
identical results in every segment. The procedure was to focus on the 
clear, plastic circle which appears on each section of the film as the 
scanner passes across the image. This circle became 0.00 in every 
instance. Consequently, reliable crop data was acquired since all 
calibration factors were now constant in the scanning process. The state of 
Missouri was scanned using this improved method and the results were 
found to be more accurate. 

Once the data has been scanned, it must be labeled for crop type. Tract 
and field numbers were provided by the use of a coordinate system and 
this data was then merged with the ground observation data. This provided 
crop labels. This labeled data can then be used for both computer train- 
ing and testing information. 





The classification procedure is explained at the beginning of this section. 
However, since the calibration was done using local calibration points, 
the classifier training and the computer evaluations were performed in 
two ways. 

The two ways were: 

    1. All data was pooled and used for both the training and testing. 

    2. All data in each segment was used for both training and testing, 
       one segment at a time. 

The results were then pooled (matrix sum). The prior probabilities in 
each case were proportional to the training data and, since this training 
data was also used to test, they were also proportional to the data being 
classified. 

In the instance of the pooled training, the prior probabilities were the 
same for each segment. In the local training, the prior 
probabilities were different for each segment and depended on the data 
in each. For the local training, all conditions were optimal, which would 
mean that the classification accuracy is maximal. 
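The two bookkeeping steps described above, priors proportional to the training data and pooling of segment results by matrix sum, can be sketched as follows. The segment names, crop counts, and classification matrices are hypothetical and serve only to show the arithmetic.

```python
def priors_from_counts(counts):
    """Prior probability of each crop, proportional to its training count."""
    total = sum(counts.values())
    return {crop: n / total for crop, n in counts.items()}

def matrix_sum(matrices):
    """Element-wise sum of equally sized classification matrices."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[i][j] for m in matrices) for j in range(cols)]
            for i in range(rows)]

# Hypothetical training counts per segment (NOT data from the report).
segment_counts = {
    "segment_1": {"corn": 40, "pasture": 160},
    "segment_2": {"corn": 90, "pasture": 10},
}

# Local training: a separate set of priors for every segment.
local_priors = {seg: priors_from_counts(c) for seg, c in segment_counts.items()}

# Pooled training: one set of priors from all segments together.
pooled_counts = {}
for counts in segment_counts.values():
    for crop, n in counts.items():
        pooled_counts[crop] = pooled_counts.get(crop, 0) + n
pooled_priors = priors_from_counts(pooled_counts)

# Hypothetical per-segment classification matrices (rows: corn, pasture).
segment_matrices = [
    [[35, 5], [10, 150]],   # segment_1
    [[85, 5], [2, 8]],      # segment_2
]
summed = matrix_sum(segment_matrices)
```

In this example the corn prior is 0.2 in one segment and 0.9 in the other under local training, but about 0.43 everywhere under pooled training, which shows how strongly the local priors can be tailored to each segment.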

As a preliminary check on the effect that the different calibration points 
had on the data, a cluster analysis on all data was run. The means for 
each field were computed by segment and crop. These means were clustered 
using a program written by C.T. Zahn of Stanford University. 1/ The fact 
that the means clustered by segments rather than by crops was additional 
proof of the problems which had arisen because of calibration differences. 
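The idea of clustering field means with a minimal spanning tree can be sketched as follows. This is a deliberate simplification and not Zahn's program: his method removes edges judged inconsistent with their neighborhood, whereas this sketch simply cuts the longest edges of the tree. The field means are hypothetical two-band values.

```python
import math

def mst_edges(points):
    """Prim's algorithm; returns the (length, i, j) edges of a minimal spanning tree."""
    n = len(points)
    in_tree = [False] * n
    best = [(math.inf, -1)] * n   # (distance to the tree, nearest tree node)
    best[0] = (0.0, -1)
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i][0])
        in_tree[u] = True
        if best[u][1] >= 0:
            edges.append((best[u][0], best[u][1], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v][0]:
                    best[v] = (d, u)
    return edges

def clusters_by_cutting(points, k):
    """Cut the k - 1 longest tree edges; return a cluster label for each point."""
    kept = sorted(mst_edges(points))[: len(points) - k]
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for _, i, j in kept:
        parent[find(i)] = find(j)
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(points))]

# Hypothetical field means (NOT report data): (segment, crop, band means).
fields = [
    ("A", "corn", (10.0, 20.0)),
    ("A", "pasture", (11.0, 22.0)),
    ("B", "corn", (30.0, 40.0)),
    ("B", "pasture", (31.0, 43.0)),
]
labels = clusters_by_cutting([f[2] for f in fields], k=2)
```

Here the two clusters follow the segments rather than the crops, which is the pattern the text describes as evidence of a calibration effect.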

Figure 15 provides an overall state by state comparison of classification 
accuracy. Figures 16-19 summarize the percent correct classification for 
major crops in each state. These figures compare both methods of training 
on the same data sets; the difference lies in the results of local 
versus pooled training data. 

Tables 59-65 give the classification matrix for both methods of training. 
When local training data was used for training, a classification matrix 
was available for each segment. These segment classification matrices 
were summed to obtain the final classification matrices in this report. 


1/ C.T. Zahn, "Graph-Theoretical Methods for Detecting and Describing Gestalt 
Clusters," IEEE Transactions on Computers, Vol. C-20, No. 1, 1971. 


Figure 15 -- Comparison of overall percent classification by 
states, 1972. (/// Slashes indicate global classification.) 

Figure 16 -- Comparison of classification methods by crop, Kansas, 
August 18, 1972. (/// Slashes indicate global classification.) 

Comparison of classification methods by crop, Missouri, August 29, 1972. 
(/// Slashes indicate global classification.) 

(Bar charts not reproduced.) 


Figures 15-19 indicate that the training by segment (local) gave higher 
classification percentages. The differences in classification percentages 
can be attributed to three sources: 

1. differences in calibration of data when scanned by a microdensitometer. 

2. differences in the number of crop classes and the prior probabilities 
of each crop class. 

3. differences in the variability of a local versus pooled data set. 

Interpretation of Figure 15 is quite easy. Kansas (with some calibration 
effect and only seven crop classes) was not greatly affected by the 
calibration effects. However, South Dakota, with calibration differences and 
many crop classes, was drastically affected. In Missouri, differences were 
slight between classification results comparing local training versus 
pooled training. 


Table 59 — Classification of flightlines 3 and 10, by segment, using
quadratic discriminant functions on all eight spectral variables,
Kansas aircraft data, September 1972.

Crop      Percent    ALFA    CORN    FLOW    GSOR    HARV    OTHR      PSTR
          Correct
ALFA        94.2     1238       0       0      21       8      36        11
CORN        93.9        0     247       0       2       3      11         0
FLOW        80.9        0       0    8383     398    1432      47       100
GSOR        82.2        4       1      51    3325     181      26       498
HARV        66.0        0       0    1031     489    4797      29       922
OTHR        70.1       37       4      13      17      14     312        48
PSTR        83.2       18       0     697    2927    1677      70    26,644
OTHERS                525     129    3186    1606    3290     829     7,166

Overall performance 80.7 percent










Table 60 — Classification of flightlines 3 and 10, on all eight spectral
variables, Kansas aircraft data, September 1972.

Crop      Percent    ALFA    CORN    FLOW    GSOR    HARV    OTHR      PSTR
          Correct
ALFA        76.0      999      14       0     107       4      20       170
CORN        41.8      170     147       0      19       0       2        14
FLOW        78.6        8      52    8141     352     708       9      1090
GSOR        46.5      102     107      81    2063     485      62      1540
HARV        41.6       16      42    1816     403    3025      26      1940
OTHR        10.6       43       5      21     126      37      47       166
PSTR        87.6        3     241     222     916    2400     180    28,071
OTHERS                335     596    2023    1402    1462     660    10,728

Overall performance 75.6 percent


Overall % correct = 66.89


Table (number not legible) — Classification of flightlines 2 and 5, using eight spectral variables, South Dakota, September 1972.

Overall % correct = 44.54











[Classification matrix not legible in the source copy.]

Overall % correct = 79.61



The microdensitometer can scan a photograph and obtain either density
values or transmission values or both. Transmission values are functionally
related to density readings by the following equation:

$$\text{Density} = \log\left(\frac{1}{\text{transmission}}\right)$$

Theoretically, all information would be contained in either mode and
neither would add anything new to the data. However, in practice, this
does not hold true for two reasons:

1. The scanner seems to saturate. The results of this saturation
affect density measurements. It becomes difficult to differentiate
between brown wheat, brown hay, harvested grains, and bare
soil. When the sensor saturates, it gives similar readings even
though the colors are quite different. With the use of the
transmission values, correct classification is increased but lacks
complete reliability.

2. An additional reason for the one mode preference is concerned with
the computer operation. The computer algorithm assumes that the
data is multivariate normal with equal covariance matrices. Certainly,
if the data was multivariate normal in the measurement
space using density values, it would not be multivariate normal
after it had been transformed by a reciprocal of the log transformation.
Obviously, they could not both be distributed as multivariate
normal data. Thus, it is imperative to investigate the effects of
variables on classification groups.
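
The second reason can be illustrated numerically. The sketch below assumes a base-10 logarithm (the report writes only "log") and uses simulated density readings, not data from the study: readings drawn from a normal distribution become strongly skewed once converted to transmission, so both modes cannot be normal at once.

```python
import math
import random

def density_from_transmission(t):
    """Density = log10(1 / transmission); base 10 is an assumption."""
    return math.log10(1.0 / t)

def transmission_from_density(d):
    """Inverse of the relation above."""
    return 10.0 ** (-d)

def skewness(values):
    """Sample skewness (third standardized moment); near 0 for normal data."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

random.seed(0)
# Simulated (hypothetical) density readings, normal by construction.
densities = [random.gauss(1.0, 0.3) for _ in range(20000)]
transmissions = [transmission_from_density(d) for d in densities]

skew_density = skewness(densities)            # close to zero
skew_transmission = skewness(transmissions)   # clearly positive
```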

A stepwise discriminant analysis was performed on the training data in
South Dakota. The procedure used was program BMD07M of the BMD statistical
package. 1/ This program performs a stepwise linear discriminant analysis
with proportional group priors on the training data. Variables are entered
or deleted from the discriminating set based upon an F-test of group
differences for a particular variable. The variable that has the largest
pairwise group F-value is the first variable entered in the discriminant
set. This procedure was executed on all eight variables and then upon
the subsets of variables corresponding to transmission and density scanning
mode respectively. Some of the original nine groups were pooled to eight,
six, and then four groups, and the stepwise classification was performed
on the merged groups. The mergers are as follows:


1/ Biomedical Computer Programs. W. J. Dixon, Editor. Berkeley, California:
University of California Press, 1973.


a) For the eight groups: "Harvested Grains" and "Harvested
Row" were merged to form the classification group "Harvested."

b) For the six groups: Hay, Pasture, and Fallow were merged
(a grasses type of cover), in addition to the above merger.

c) For the four groups: Corn and Soybeans formed a group.
Wheat, Pasture, and Harvested Grains formed a group. Plowed,
Fallow, and Harvested Row formed a group. This merger in
the above groups was a result of a cluster analysis performed
on the group means.

The option was specified for inclusion of variables at each step, with no
deletion, in performing the stepwise discriminant analysis. Thus,
supposedly, one is adding more information (in the form of more variables)
at each step of the stepwise discriminant analysis. The results are
astonishing, as can be seen in Figures 20-27.
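
The first-entry rule of the stepwise procedure can be sketched as follows. This is not BMD07M: it assumes a plain one-way analysis-of-variance F statistic computed across all groups for each variable, and the training values attached to the variable names DRED and TGREEN are hypothetical.

```python
from statistics import mean

def one_way_f(groups):
    """One-way ANOVA F statistic for one variable.
    groups is a list of lists, one list of readings per crop group."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups) / (k - 1)
    within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups) / (n - k)
    return between / within

def first_variable(data):
    """Return (name, F) of the variable with the largest F statistic.
    data maps a variable name to its per-group readings."""
    scores = {name: one_way_f(groups) for name, groups in data.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical training readings for three crop groups.
training = {
    "DRED":   [[1.0, 1.2, 0.9], [2.0, 2.1, 1.9], [3.0, 3.2, 2.9]],
    "TGREEN": [[1.0, 2.0, 3.0], [1.5, 2.5, 3.5], [1.2, 2.2, 3.2]],
}
entered, f_value = first_variable(training)
```

In this toy input the groups are well separated on DRED and overlap heavily on TGREEN, so DRED would be entered first.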

Note the following:

(1) Overall percent correct classification increases only slightly when
two variables are in the discriminating set, irrespective of what
variables are used and what the classification groups are. The contention
of C. R. Rao 1/ that more variables do not necessarily mean more
information and hence more discriminating power is supported by the
data.

(2) The addition of a particular variable influences one classification
group greatly. For example, in Figure 20, the variable TGREEN
(transmission in green) has a great effect on the classification accuracy
of wheat when combined with DRED (density red) and DCLEAR. However,
once TRED has entered, TGREEN's effect is diminished by the confusion
variable TRED.

(3) All discriminating information is not contained in one scanning mode
(four variables). For example, compare the classification curves for
the group FALLOW in Figures 24 and 25 respectively. Fallow was correctly
classified about 65 percent of the time when scanned in density
mode but had zero recognition in the transmission mode.

(4) The overall classification accuracy increases as the number of groups
is decreased. See Figures 30 and 31.


1/ C. R. Rao, "Covariance Adjustment and Related Problems in Multivariate
Analysis," in Multivariate Analysis, P. R. Krishnaiah, editor, Academic Press,
1966.


(5) The overall classification accuracy is greater for variables measured
in density units, and the use of the variables measured in transmission
does not improve the overall classification when only four
variables are considered. This can be seen in Figures 20, 23, 26, and 29.

(6) This analysis leads one to conclude that if there is interest in only
one or perhaps several crop groups, a hierarchical (or layered)
classifier might be the best approach to crop identification. At
each stage of the hierarchy, a feature selection would be performed
to maximize the classification accuracy of the particular crop or crops
of interest.

A single stage classifier with all variables used clearly would not do
well on the major crop Wheat in Figure 20, as evidenced by the last stage
of the stepwise discriminant analysis.



FIGURE 20

[Plot not legible in the source copy. Axes: percent correct versus variable entered.]


Stepwise discriminant analysis, classification into eight groups, all variables, South Dakota, 1972 



[Plot not legible in the source copy. Axes: percent correct versus variable entered.]

Stepwise discriminant analysis, classification into six groups, density and transmission scanning mode, South Dakota, 1972.

FIGURE 27

Stepwise discriminant analysis, classification into six groups, transmission scanning mode, South Dakota, 1972.
(Legible axis labels: TGREEN, TCLEAR.)

FIGURE 28

Stepwise discriminant analysis, classification into six groups, density scanning mode, South Dakota, 1972.
(Legible axis label: DGREEN.)

FIGURE 29

Stepwise discriminant analysis, classification into four groups, density and transmission scanning mode, South Dakota, 1972.
(Legible axis labels: TCLEAR, DCLEAR, DGREEN, TBLUE, TGREEN.)

[Plot with caption not legible. Legible axis labels: TCLEAR, TRED, TBLUE, TGREEN.]

FIGURE 31

Stepwise discriminant analysis, classification into four groups, density scanning mode, South Dakota, 1972.
(Legible axis labels: DBLUE, DCLEAR, DGREEN.)

Figure 32 — Overall classification accuracy by number of groups and measurement mode for four variables.
(Horizontal axes: variable entered, in sequential order.)



4.2 Crop Acreage Estimation

The objective of this section is to present a procedure that will use
classification results to produce an area acreage estimate. The regression
technique presented may not be appropriate for users with different
ground data. This technique requires that a random subsample of the
total of all segments be selected for ground observations.

It is assumed that classification errors will be substantial; that is,
perfect classification is not possible, and unbiased classification is
not probable. Unbiased classification means more than that the
classification errors simply balance. It means that the prior probabilities
used are correct and the data are multivariate normal.

If unbiased classification were possible, we could use pixel counting
techniques as estimators.

We know that the prior information was not exact and further that the
data are not multivariate normal. Some delicate adjustments are
necessary to produce an unbiased estimator and, in order to make this
adjustment, we will use the fact that a random subsample of segments has
been selected for ground observations.

The first step is to estimate the linear relationship between total crop
[acres and the] pixels inside the segment. This information must
come from the ground truth segments, and the relationship must be applied
[to the segments] not selected for ground observations. An
example of how the procedure would work follows. It turns out [that]
the estimates are poor because the relationships that are
established in the ground observation segments do not represent the
population that is being estimated.

This data came from the Southwest Crop Reporting District in Kansas.
[illegible] squared (r²) between the [illegible] of interest.

[illegible] between acres on the ground and points classified
[illegible] on the ground area can be established on a per
segment basis.

Table 66 — Source, $r^2$, $\bar{Y}$, $\bar{X}$, Var(Y), Cov(XY), and Var(X).

Source                                       r^2    Y-bar   X-bar      Var(Y)     Cov(XY)      Var(X)

Total acres (Y) versus total pixels (X)      .95     1843    1841   2,401,627   2,716,190   3,242,228
Alfalfa acres (Y) versus alfalfa pixels (X)  .01       39     223       7,187      -2,417       9,302
Pasture acres (Y) versus pasture pixels (X)  .89      728     890   1,467,689   1,325,965   1,348,245
Corn acres (Y) versus corn pixels (X)        .76      145      69      61,931      23,668      11,850
G. Sorghum acres (Y) versus
  G. Sorghum pixels (X)                      .53      171     404      70,505     115,948     656,917


The model that will be used to represent the relationship is:

$$\hat{y}_i = \bar{y}_i + b_i \left( \bar{X}_{\text{total},i} - \bar{x}_{\text{sample},i} \right)$$

where $\hat{y}_i$ is the adjusted acreage estimate for the i-th crop.

$\bar{y}_i$ is the average number of acres of the i-th crop in the selected
segments.

$b_i$ is the regression coefficient for the i-th crop estimated by:

$$b_i = \frac{\sum_j (x_{ij} - \bar{x}_i)(y_{ij} - \bar{y}_i)}{\sum_j (x_{ij} - \bar{x}_i)^2} = \frac{\mathrm{cov}(xy)}{\mathrm{var}(x)}$$

$\bar{X}_{\text{total},i}$ is the average number of pixels of the i-th crop in all
segments in a county.

$\bar{x}_{\text{sample},i}$ is the average number of pixels in the selected sample
for the crop.

The estimator is the adjusted average number of acres in the average
segment. To get an estimate of the total, $\hat{y}_i$ would be multiplied by
the total number of segments in the population (N).

The error of the regression estimator is written as:

$$\mathrm{Var}(\hat{y}_i) = \frac{S_{y_i}^2 \left(1 - r_i^2\right)}{n}$$

where $\mathrm{Var}(\hat{y}_i)$ is the variance of the final adjusted estimator of the
average segment of the i-th crop.

$S_{y_i}^2$ is the adjusted between segment sums of squares for the i-th
crop.

$r_i^2$ is the correlation coefficient squared between the number of
acres in the segment and the computer classified number of
pixels in the segments for the i-th crop.

n is the number of degrees of freedom in the estimator.

Since the estimator for the total number of acres in the county is $N\hat{y}$,
the variance of the total is $N^2$ times $\mathrm{Var}(\hat{y})$.

The regression estimator above is the best in terms of lowest bias and
smallest variance. Other estimators of the regression type, such as
ratio estimators and difference estimators, may be quite good in special
cases. The regression estimator has definite advantages over the other
two types of estimators just mentioned.

In Stevens County, Kansas, each pixel was classified. There were 410,505
pixels in the county and 468,000 acres. Each pixel represents 1.1401
acres. Actually, the county boundaries were approximated, and this
introduces a small amount of error. Out of the total of 410,505 pixels, the
following pixels were classified as:


1.) Alfalfa            5,362        5.) Other    37,567
2.) Pasture          172,021
3.) Corn              30,448
4.) Grain Sorghum    165,107



The first step is to put these pixels into a per segment basis. There 
were 280 segments in the county so the average segment contains 1,466 
pixels for all land uses. The other averages were: 


1. Alfalfa            19.2
2. Pasture           614.1
3. Corn              108.7
4. Grain Sorghum     590.0
5. Other             134.0
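
The per-segment conversion can be checked with a short sketch that uses only the pixel counts, county acreage, and segment count quoted in the text; the variable names are illustrative.

```python
# Classified pixel counts for Stevens County, as listed in the text.
pixels = {
    "Alfalfa": 5362,
    "Pasture": 172021,
    "Corn": 30448,
    "Grain Sorghum": 165107,
    "Other": 37567,
}
COUNTY_ACRES = 468_000
SEGMENTS = 280

total_pixels = sum(pixels.values())
acres_per_pixel = COUNTY_ACRES / total_pixels
avg_pixels_per_segment = total_pixels / SEGMENTS
avg_by_crop = {crop: count / SEGMENTS for crop, count in pixels.items()}
```

The class counts add up to the 410,505 pixels quoted for the county. The pasture average computes to about 614.4, slightly different from the 614.1 printed in the list.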


Since the relationship between alfalfa acres and alfalfa pixels is 
quite poor, we shall demonstrate the procedure using pasture data. 


The pasture acreage estimate for Stevens County using ERTS data is:

$$\hat{y}_{\text{pasture}} = 430 + .9835\,(614 - 714) = 332$$

$$\hat{Y} = (280)(332) = 92{,}960 \text{ acres for Stevens County.}$$

$$\mathrm{Var}(\hat{Y}_{\text{acres}}) = \frac{(1{,}467{,}689)(1 - .89)(280)^2}{4} = 3{,}164{,}337{,}484$$

Standard Error = 56,252.4     C.V. = 60.5%

The estimate and variance without using LANDSAT data are 120,400 and
23,013,363,520, respectively,

where $V(\hat{Y}) = \frac{1{,}467{,}689}{5}\,(280)^2 = 23{,}013{,}363{,}520$

and $\text{C.V.} = \frac{151{,}702}{120{,}400} = 126\%$
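
The pasture calculation can be reproduced with a brief sketch. It assumes the regression coefficient is Cov(XY)/Var(X) from Table 66 and that the variance of the total uses four degrees of freedom, which is the reading that matches the printed result; the function name is illustrative.

```python
import math

def regression_estimate(y_bar, b, x_total_bar, x_sample_bar):
    """Adjusted per-segment acreage: y_bar + b * (X_total - x_sample)."""
    return y_bar + b * (x_total_bar - x_sample_bar)

N_SEGMENTS = 280
VAR_Y = 1_467_689           # Var(Y) for pasture, Table 66
R_SQUARED = 0.89            # r^2 for pasture, Table 66
b = 1_325_965 / 1_348_245   # Cov(XY) / Var(X) for pasture, Table 66

per_segment = regression_estimate(430, b, 614, 714)
total_acres = N_SEGMENTS * round(per_segment)   # the report rounds to 332 first

# Variance of the county total, assuming 4 degrees of freedom.
var_total = VAR_Y * (1 - R_SQUARED) * N_SEGMENTS ** 2 / 4
std_error = math.sqrt(var_total)
cv_percent = 100 * std_error / total_acres

# Without LANDSAT: direct expansion of the sample mean, sample size 5.
total_no_landsat = N_SEGMENTS * 430
var_no_landsat = VAR_Y / 5 * N_SEGMENTS ** 2
cv_no_landsat = 100 * math.sqrt(var_no_landsat) / total_no_landsat
```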

Table 67 shows acreage estimates with variance and coefficients of varia- 
tion for various crops with the aid of LANDSAT data. 

Table 68 shows acreage estimates, variances, and C.V.’s for Stevens 
County, disregarding LANDSAT data. 

The first point is that the variances of the estimates that use LANDSAT
depend on the variance of the ground observations, the correlation of
LANDSAT data with ground observations, and the sample size. If the
correlation is very high, as with pasture, it is possible to produce an accurate
estimate only if the ground observation is accurate. For example, no
alfalfa was observed in the ground truth segments. Even though the
computer was trained with alfalfa from outside the county and 5,362 pixels
were classified into the alfalfa category for Stevens County, the
relationship was bad and the ground observations were poor; therefore,
the estimate is bad and the C.V. very large.

These estimates and estimates of the variance were computed for two
sample sizes. There were really three segments in Stevens County, and
one of those was not used because of location problems. These numbers
used the two segments left in Stevens County, the relationship for all
17 segments, and the total Stevens County classification data. However,
variances and C.V.'s were figured for samples of size 5 and 10.

If total aircraft classification were available for the same area, the
model would be as follows:

$$\hat{y} = \bar{y} + b_1 \left( \bar{X}_1 - \bar{x}_1 \right) + b_2 \left( \bar{X}_2 - \bar{x}_2 \right)$$

The variance would be similar to the previous formula:

$$\mathrm{Var}(\hat{y}) = \frac{S_y^2 \left(1 - R^2\right)}{n}$$

where $R^2$ is the multiple correlation coefficient squared and n is the
number of degrees of freedom left in the estimator.



[Tables 67 and 68: acreage estimates, variances, and coefficients of variation for Stevens County, with and without LANDSAT data, for sample sizes 5 and 10. Table bodies not legible in the source copy.]


V. Cost Analysis 


This section is presented to provide cost information relative to
various sources of data collection. It is documented so that as
technology is improved, the cost of developing an integrated data
collection system can be realistically evaluated.

However, the cost data cited reflects only the conditions under
which this project was completed. It is to be expected that new
technology will change some of these costs in the future.

Cost of Ground Data

The cost of ground data can be broken into collection costs and
summarization costs. The data collection costs include:

a. pre-survey planning and materials preparations,

b. enumerator training schools, and

c. enumerator fieldwork.

The summarization costs include:

a. collection, edit, and keypunch time for Washington, D.C.
and State Statistical Office personnel, and

b. programming and summarization costs.

These costs pertain to all four test sites and are as follows:

5.1 Data Collection

1) Survey Planning and Materials Preparation
   Research and Development
      Salaries                                     $1,342.00
      Travel costs (map preparation salaries)         263.67
   Programming Costs
      Salaries                                        849.59
      Computer costs                                1,259.11     $ 3,714.37

2) Enumerator Training Schools
   Instructors
      Salaries                                        901.84
      Travel                                          477.44
   Enumerators
      Salaries                                        530.00
      Travel                                          210.00     $ 2,119.28

3) Enumerators Fieldwork
      Salaries                                     $4,931.65
      Travel                                        3,044.00       7,975.65

Total Data Collection Costs                                      $13,809.30


5.2 Data Summarization

1) Collection Edit and Keypunch Costs
      SSO Salaries                                  2,524.05
      Research and Development Salaries             6,816.68     $ 9,340.73

2) Programming and Summarization Costs
      Salaries                                      2,632.80
      Computer Costs                                1,281.49     $ 3,914.29

Total Data Summarization Costs                                   $13,255.02

Total LANDSAT Ground Truth                                       $27,064.32


It should be noted that the above cost data are for the update work 
conducted in August, September, and October. The costs of the regular 
June Enumeration Survey (JES) are not comparable since in addition to 
observing and recording ground cover, the JES records crop intentions 
and livestock numbers. Estimates of these costs can be derived, how- 
ever, by using enumerator time and mileage costs. Mileage rates and 
hourly wages applied against the miles driven and hours worked toge- 
ther give a total cost estimate by segment. This comparison follows: 


5.3 JES Fieldwork Costs

A. Time

District   State        Time             #Segs.   $/hour       Cost

   9       Missouri     6.42 hr/seg.       52     $3.30     $1,101.67
   6       S. Dakota    4.80 hr/seg.       50      3.30        792.00
   7       Kansas       8.93 hr/seg.       48      3.30      1,414.51
   2       Idaho        5.75 hr/seg.       44      3.30        834.90

Total                                     194               $4,143.08

Time cost per segment = $21.36



B. Mileage

District   State        Miles            #Segs.   $/mile       Cost

   9       Missouri      99.98 m/seg.      52      .11      $  571.89
   6       S. Dakota     80.86 m/seg.      50      .11         444.73
   7       Kansas       136.81 m/seg.      48      .11         722.36
   2       Idaho         82.85 m/seg.      44      .11         400.99

Total Mileage Cost                                          $2,139.97

Mileage cost per segment = $11.03

C. Total Time and Mileage                                   $6,283.05

Total time and mileage cost/segment $32.39

5.4 Update Fieldwork Costs (3 visits)

A. Salaries                                                 $4,931.65
B. Travel                                                    3,044.00
C. Total Time and Mileage                                   $7,975.65
   ($7,975.65/3 = $2,658.55)

Total update time and mileage costs per segment $41.11

Total update time and mileage costs per segment per visit $13.70


The difference between $6,283.05 and $2,658.55 represents the additional
costs of $3,624.50 needed to locate the June Segment Operators and to
secure livestock data and farm labor data. The LANDSAT update fieldwork
only included locating the segments and recording the crops present
and their conditions. The operators were not contacted unless
the enumerator could not view the fields from the road.
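
The fieldwork cost arithmetic in sections 5.3 and 5.4 can be reproduced with a short sketch. All figures come from the tables above; the variable names are illustrative, and the totals are computed from unrounded products, so they agree with the printed values only to within a cent.

```python
WAGE_PER_HOUR = 3.30
RATE_PER_MILE = 0.11

# state: (hours per segment, miles per segment, number of segments)
jes = {
    "Missouri":  (6.42, 99.98, 52),
    "S. Dakota": (4.80, 80.86, 50),
    "Kansas":    (8.93, 136.81, 48),
    "Idaho":     (5.75, 82.85, 44),
}

segments = sum(n for _, _, n in jes.values())
time_cost = sum(hours * n * WAGE_PER_HOUR for hours, _, n in jes.values())
mileage_cost = sum(miles * n * RATE_PER_MILE for _, miles, n in jes.values())
jes_total = time_cost + mileage_cost
jes_per_segment = jes_total / segments

# LANDSAT update fieldwork: three visits to the same segments.
update_total = 4931.65 + 3044.00
update_per_visit = update_total / 3
update_per_segment_per_visit = update_per_visit / segments

# Additional cost of one JES visit over one update visit.
additional_cost = jes_total - update_per_visit
```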


Tables 69 through 76 show detailed time and mileage data for the study 
sites. 


Table 69 — Missouri 1972 JES Time and Mileage Data

Dist    Number    Visits/    Hours/    Miles/
         Segs       seg        seg       seg

 10       60*      1.43       4.90      70.63
 20       51*      1.75       4.70      81.08
 30       49       2.04       4.91      82.78
 40       50       1.94       5.30      86.78
 50       60*      1.93       5.34      80.90
 60       39*      1.82       6.95      85.08
 70       46       1.32       5.19      61.04
 80       42       1.71       5.55      94.52
 90       52       2.23       6.42      99.98


*Not all segments in this district had cost data reported. 


Table 70 — South Dakota 1972 JES Time and Mileage Data

Dist    Number    Visits/    Hours/    Miles/
         Segs       seg        seg       seg

 10       31       1.87       7.93     107.16
 20       46       1.67       5.33      93.20
 30       42       1.83       5.74      88.19
 40       31       1.90       8.19     122.23
 50       42       1.90       6.96      99.90
 60       50       1.92       4.80      80.86
 70       22       1.86       8.32     131.23
 80       35       1.63       7.23     101.14
 90       51       1.82       4.65      75.98



Table 71 — Kansas 1972 JES Time and Mileage Data

Dist    Number    Visits/    Hours/    Miles/
         Segs       seg        seg       seg

 10       42       2.26       6.99     130.52
 20       54       2.04       5.67      94.43
 30       50*      2.58       5.74      88.94
 40       40       2.25       8.16     141.90
 50       56       2.46       6.81     127.21
 60       53*      1.87       4.92      69.00
 70       48       2.79       8.93     136.81
 80       60*      1.93       6.68     105.83
 90       53*      1.81       4.69      82.34


Table 72 — Idaho 1972 JES Time and Mileage Data

Dist    Number    Visits/    Hours/    Miles/
         Segs       seg        seg       seg

  2       54       1.54       5.75      82.85


Table 73 — Time and mileage data for Idaho by enumerator.

Enumerator        JES          Average number   Average     Average miles
identification    segments     visits per       hours per   per
                  completed    segment          segment     segment

      6               3            3.00            3.78        104.67
     12               8            3.00            8.83        125.00
     18              10            2.10            3.44         56.70
     19              14            3.29            7.41         80.07
     30              13            1.54            4.19         80.23
     33               6            2.83            5.93         71.00

Totals:              54            2.54            5.75         82.80



Table 74 — Time and mileage data for Missouri by enumerator.

Enumerator        JES          Average number   Average hours   Average miles
identification    segments     visits per       per             per
                  completed    segment          segment         segment

      1              15            1.33             4.89            58.53
      2              15            1.47             4.19            80.93
      3               6            2.50             3.38            87.5
      4              17            1.76             3.66            59.18
      5*              6            1.83             7.77           118.33
      6               9            1.89             7.09            93.78
      7*             12.5          1.68             4.69            66.64
      8              13            1.77             7.06            78.92
      9              10            1.40             4.99            78.10
     10              11            1.73             4.44           109.91
     11              16            2.12             4.64            78.56
     12              15.7          1.27             5.06            63.76
     13              17            1.41             4.50            57.53
     14              13            2.15             6.27           109.23
     15              13            1.08             4.94            56.23
     16              11            1.45             5.11            81.18
     17              15.5          2.58             5.97           115.10
     18              14            2.43             7.01            93.21
     19*             14            1.21             3.93            51.00
     20              11            1.55             6.66            90.09
     21*             13            2.00             5.58            78.46
     22               7            3.57             9.18            99.71
     23*             13            1.92             5.47            87.62
     24              11            3.27            10.27           163.91
     25              14            2.07             4.79            90.36
     26              12            1.67             4.60            64.17
     27              14            2.43             6.70           102.57
     28              14.3          2.10             4.70           103.08
     29               9            1.67             8.78            67.67
     30              16            1.94             5.07            78.00
     31               9            2.56             6.13           114.33
     32              18            1.28             4.56            46.50
     33               9            1.22             4.28            62.11
     34              10            1.50             5.38            77.80
     35              17            1.41             3.76            65.29
     36               8            1.75             6.12           130.25

TOTALS:             449            1.82             5.42            82.22

* Supervisors



Table 75 — Time and mileage data for South Dakota by enumerator.

Enumerator        JES          Average number   Average hours   Average miles
identification    segments     visits per       per             per
                  completed    segment          segment         segment

      -               2.9          1.03             1.04            37.59
      -               3            1.33             5.50           111.33
      -               8            1.75             4.35            73.12
      -              13.8          1.88             6.96           151.52
      -               8.6          1.74             6.25            87.79
      -               7            1.43             8.36            80.14
      -              14            2.07             5.91           117.03
      -               6            1.50             4.79            46.50
      -              19            2.16             4.13            80.21
     10              15            1.93             5.51            95.93
     11               9            1.33             7.19            69.78
     12               7            1.86             5.17            63.43
     13*             13            1.85             3.87            79.54
     14              11            1.45             5.18            71.91
     15*              7.4          1.08             2.90            62.03
     16              10.5          1.43             5.09            76.86
     17              13            2.38             9.29           144.69
     18              12            2.50             1              159.00
     19              25.4          2.24             4.72            82.72
     20              15            1.53             7.01            90.07
     21              11            1.73             4.67            71.55
     22              14            1.64             4.90           133.29
     23              15            1.60             5.61            70.33
     24*              8.7          1.84             8.33           126.78
     25              13.1          2.14             7.36           129.01
     26               8.3          2.29             7.14           135.54
     27              13            1.54             7.81            83.85
     28              15            2.00             7.48            95.33
     29              15            1.80             5.38            77.67
     30               5            1.40             6.02            33.00
     31               9            1.56             5.81           113.67
     32               2.3          2.17             5.20           149.13

TOTALS:             350            1.83             6.27            95.91

Enumerator identification for the first nine rows is not legible in the source copy.

* Supervisors


Table 76 — Time and mileage data for Kansas by enumerator 


Enumerator       JES segments   Average visits   Average hours   Average miles
identification   completed      per segment      per segment     per segment

 1                   8.9            2.70             7.15           105.73
 2                  18              2.00             5.21           105.11
 3                  15              2.40             5.58            89.67
 4                  12              2.33             4.63            78.17
 5                  20              1.90             5.99           111.00
 7                  14              2.71             5.29            98.07
 8                   9.8            1.63             9.29           134.69
 9                  11              2.09             6.33            69.18
10                   9              1.56             7.96           135.56
11                   4              1.75             7.96            80.00
12*                  3.4            2.94            11.06           272.94
13                  19              1.95             4.82            74.37
15                  16.9            2.84             9.09           180.65
16                  14              3.5              8.78           142.36
17*                  4              1.25             6.69           116.75
18                  11              1.73             4.83            58.64
19                  12              2.42             8.38           162.08
20                  12              3.25             8.68           132.92
21                  14              2.64             5.08           108.07
22                   3              1.67             5.67            98.67
23                  12              2.25             7.85           122.00
24                  14              2.71             4.67            90.00
25                  16              2.44             4.60            99.69
26                  13              1.85             6.50           108.77
27*                  5              3.00            15.92           260.20
28                  16              2.12             5.43            89.12
29                  12              1.75             5.62            69.25
30                  1?              2.08             7.25           117.25
31                  1?              1.60             4.76            65.73
32                  14              1.21             4.23            69.14
33                   9              3.11             7.62           114.67
34                   7              3.29            10.86           173.86
35                  11              2.82             9.20           151.09
36                  14              1.64             7.02            93.57
37                  11.5            1.48             4.85            81.04
38                  17.5            2.46             5.15           107.43
39                  15              1.53             5.29            91.40
40                  11              1.73             5.93            54.91

TOTALS :           456              2.21             6.44           107.10


* Supervisors 




5.2 Aircraft Cost Analysis 

NASA provided the following estimates for aircraft costs: 

U-2 operational costs are $2,150 per hour with coverage of about 
400 nautical miles per hour. Coverage is 14.8 nautical miles on 
a side per scene. 

Scenes per hour = 400 / 14.8 = 27.03

Cost per scene = $2,150 / 27 = $79.63 ≈ $80

For the study areas, the acquisition costs average about $60 per
segment.

The activities and the approximate time and costs required to prepare the 
aircraft data for crop classification are: 



                                                   Average time
                                                   per segment

Sketch segment and record field boundaries           37 min.

Microdensitometer scanning                           33 min.

Recording and keypunching input data for
field extraction

Total man hours                                      1.83 hours

Cost/man hour                                        $4.50

Average cost/segment                                 $8.23


ADP Costs

PDSCMS data conversion
Field extraction

Total ADP costs/segment

The average cost per segment for data preparation    $29.23

The cost of crop classification varies with the size of segment, but in
order to have a comparable cost with ground observations, it is presented
on a per segment basis. The average cost per segment for crop classifica-
tion was about $81 per segment. The average cost per data point is about 1 3
cents per point.

The total aircraft survey costs were about $170 per segment. This com- 
pares with $47 per segment per visit for the ground observations. 
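
The arithmetic behind these figures can be checked with a short sketch. The variable names are illustrative only, and the ADP figure at the end is inferred by subtraction from the quoted totals; it is not a value stated in the report.

```python
# Figures quoted in Section 5.2; all variable names are illustrative.
U2_COST_PER_HOUR = 2150.0       # dollars per flight hour
COVERAGE_NM_PER_HOUR = 400.0    # nautical miles covered per hour
SCENE_SIDE_NM = 14.8            # nautical miles on a side per scene

scenes_per_hour = COVERAGE_NM_PER_HOUR / SCENE_SIDE_NM      # about 27.03
cost_per_scene = U2_COST_PER_HOUR / round(scenes_per_hour)  # about $79.63

# Manual data preparation: total man hours times the hourly rate.
labor_cost = 1.83 * 4.50                                    # about $8.23

# Per-segment components quoted in the text.
acquisition = 60.00
data_preparation = 29.23
classification = 81.00
total_per_segment = acquisition + data_preparation + classification

# ASSUMPTION: ADP cost per segment is whatever remains of the data
# preparation cost after the manual labor; the report does not state it.
implied_adp_cost = data_preparation - 8.23

print(f"scenes per hour:   {scenes_per_hour:.2f}")
print(f"cost per scene:    ${cost_per_scene:.2f}")
print(f"labor per segment: ${labor_cost:.2f}")
print(f"total per segment: ${total_per_segment:.2f}")
print(f"implied ADP cost:  ${implied_adp_cost:.2f}")
```

The three components sum to about $170, which agrees with the total quoted above.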




This analysis deals primarily with the time and costs required for scan- 
ning the aerial photography and converting the data into a form suitable 
for crop classification by discriminant analysis in the Statistical Analysis 
System (SAS). 

Time and cost data were collected as follows: 

1) Pre-scan setup: the time (man minutes) required to locate the
segment on the microdensitometer, sketch the segment, record
field boundary coordinates, and define the microdensitometer scan-
ning parameters.

2) Scanning: the time (man minutes) required for system analog cali-
bration and microdensitometer scanning with each of the four
filters in density and transmission units.

3) Data preparation for field extraction: the time required to record
and keypunch input data for the field extraction program.

4) PDSCMS data conversion: ADP costs for converting the microdensito-
meter output data to SAS compatible data.

5) Field extraction: ADP costs for assigning crop classes, tract and
field identifiers to individual pixels on the basis of ground
observations utilizing pixel coordinate information.

Several factors contributed to the substantial differences between states for
the average cost per segment. The differences for pre-scan setup times can
be attributed to two primary factors:

1) different microdensitometer operators. A new operator was in train-
ing while scanning South Dakota, and had gained experience by the time
Missouri was scanned.

2) the relative difficulty of recording field boundary coordinates for
each state. South Dakota and Missouri were most difficult because
of their many small fields, followed by Idaho, with Kansas least
difficult.

New field boundary coordinate recording procedures were implemented near 
the end of the Idaho scanning and were subsequently employed while scan- 
ning the Missouri photography. Due to operator differences, it is difficult 
to objectively assess the effectiveness of the new procedures. Subjectively, 
it is believed the new procedures will reduce pre-scan setup time by 10-20% 
and data preparation for field extraction by 25-40%. 




Scanning time remains fairly constant between states (the large difference
in South Dakota is attributable to a new operator in training on the
microdensitometer). Small differences are a function of the number of
segments and average size of each segment.

Between-state differences in automated data processing costs are a
function of the number of segments, average size of each segment, and the
number of tracts and fields within each segment.

5.5 Computer Costs 

Processing LANDSAT data and digitized aircraft data requires enormous 
amounts of computer time. The following table shows the cost of computer 
time for processing at the Washington Computer Center for various broad 
classes of processing. 


DEVELOPMENT $ 6,631 
GROUND DATA $ 2,915 
MAPS FOR SEGMENT LOCATION $ 9,227 
AIRCRAFT DATA ANALYSIS $30,142 


TOTAL $48,915 


Development costs include converting software to run at WCC, maintenance, 
developing original programs and overhead. 

Ground data costs are for building, maintaining, and summarizing ground
data.

The MAPS were grey level maps of LANDSAT CCT's for segment location.

The aircraft analysis cost includes charges for conversion of microdensi-
tometer data, building the data files, runs used to determine the best
analysis procedure, the discriminant analysis, and the combination of the
satellite results, aircraft results, and ground data.




APPENDIX A 


ERTS ENUMERATORS INSTRUCTIONS 



ERTS ENUMERATORS INSTRUCTIONS 1/


1.1 What you will do:

You are one of about 16 enumerators in four states (Kansas, Missouri,
South Dakota, and Idaho) employed to obtain "ground truth" about crop 
species, acres and crop condition. Briefly, your job is to update 
information from the June Enumerative Survey (JES) by verifying crop 
species and acres and observing crop condition during July, August, 
September, and October. Your field verifications and observations are 
to be recorded on the Earth Resource Technology Satellite (ERTS) Ground 
Truth Printouts. 

1.2 Equipment and Supplies 

USDA identification card
aerial photos
aerial photo mailing boxes
county maps
CEF-201's
ERTS Ground Truth Printout
large envelopes for mailing completed forms
motor vehicle accident report kit
ball point pen
lead pencil, plus red, orange, and yellow colored pencils
clipboard
highway maps
Julian dates


1.3 Mailing and survey dates 

After each survey is completed you will mail the updated printout and
your CEF-201's to the SSO in the envelopes provided. All other materials
used during the survey will be retained until the final survey period.
Your final mailing will include the updated printout, CEF-201, aerial
photos and county maps, plus any other surplus materials.

The survey periods and mailing dates are as listed: 


Survey period          Enumerators mailing date

August 7-11            on or before August 11
September 11-15        on or before September 15
October 10-13          on or before October 13


1/ At the time this manual was written, LANDSAT was called ERTS, an acronym for
Earth Resources Technology Satellite. The name was never changed to LANDSAT in
this manual because the manual was never used after 1972.




1.4 Terms and definitions 


The regular enumerative survey definitions hold for this survey.

A. Segment - land area outlined in red on aerial photos and county maps. 
Each segment is identified by a permanently assigned 4-digit number. 

See the "Survey Enumerators Handbook" for discussion on use of aerial 
photos and locating segments. 

B. Tract - an area of land inside the segment which is under one manage-
ment. Each tract is identified by a letter code A, B, C, etc. on the
aerial maps and by the corresponding numeric code on the form printout
(i.e. A = 01, B = 02, C = 03, etc.). Tract boundaries and letter
codes are drawn in blue pencil inside the segment on the aerial photo.
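
The correspondence between tract letters and printout codes can be sketched as below. This is an illustration only; the function name is hypothetical, and it assumes the "etc." means the sequence simply continues through the alphabet.

```python
import string

def tract_numeric_code(letter: str) -> str:
    """Return the two-digit printout code for a tract letter (A -> 01).

    ASSUMPTION: codes continue alphabetically past C (D -> 04, ...),
    which the manual implies with "etc." but does not spell out.
    """
    letter = letter.strip().upper()
    if len(letter) != 1 or letter not in string.ascii_uppercase:
        raise ValueError(f"not a single tract letter: {letter!r}")
    return f"{string.ascii_uppercase.index(letter) + 1:02d}"

for tract in "ABC":
    print(tract, "=", tract_numeric_code(tract))
```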

C. Field - a continuous area of land inside a tract which is devoted to 
one crop or land use. Each tract on the aerial photo is divided into 
fields during the enumeration of segment acreage in late May or early 
June. Fields are numbered and their boundaries outlined in red pencil. 

D. Farm Operator - the person who is responsible for the day-to-day deci- 
sions for a tract. 


Part II - The Survey 

2.1 Purpose of the survey 

The purpose of the Earth Resource Technology Satellite (ERTS) program is 
to: 

a) investigate and evaluate the use of space imagery to identify crop
species.

b) investigate ways of using space imagery to improve agricultural sta- 
tistics. 

Through ground truth obtained during July, August, September, and October 
we will be able to check and verify the accuracy of satellite imagery 
(500 miles) and high altitude photography (60,000 feet) as a method of 
measuring crop acres which in turn will be used to generate an expanded 
estimate of crop acreage. 




Ground, high altitude, and satellite estimates of acreage will be obtained
and compared against collection costs to indicate a cost-information ratio.
Trained enumerators such as yourselves will collect the ground information from
the field. Trained photo interpreters will record species and acreages
for high altitude photography in the Washington, D.C. office. Computers
will be used to analyze satellite imagery in the Washington, D.C. office.
Cost information for each method of collection will be retained and com-
pared with the accuracy or reliability of each method of data collection.

2.2 The sample 

The segments selected for this survey were selected to provide different
crops in different locations. A different mix of crops will be found in 
Idaho versus Kansas versus Missouri versus South Dakota. How do sugar 
beets compare with potatoes in Idaho or grain sorghum in Kansas? Will 
spring wheat be distinguishable versus winter wheat? Does corn in Missouri 
look the same as corn in South Dakota? The information collected will 
provide answers to these types of questions. Additionally, with the 
distant geographic areas, inclement weather should not cover all the test 
sites and limit the quality of all imagery on a particular survey. 

We use the JES segments since they represent 100% coverage of the areas 
in question. If bad weather renders some of the aerial photography or 
satellite images useless, we will attempt to develop reliable estimates 
for the other areas based on the ERTS ground truth. This may become a 
multiple frame model for acreage estimation. 

2.3 Survey forms

For the second visit you will be provided a printout listing in segment 
and tract order fields, acreages and crops from the JES (first visit). On 
this visit you will note the condition of the crop on the printout. On 
succeeding visits the printouts will show fields, acreages, crop and con- 
ditions for each earlier visit. 

The name and addresses of the operators from the JES will be provided on 
separate sheets of paper grouped by segments. These will be for use 
when it is necessary to locate the tract operator for permission to view 
fields not observable from public roads. 




Part III - Field Observations 


3.1 Locating the segment 

Locate the segments you will be visiting on the county maps, then plot 
them on a highway map. Plan your journey to observe these segments with 
minimum mileage and travel time for each day's journey. 

3.2 Recording observations for segment 

When the printout and maps are updated on the monthly visit, use the color 
codes listed below for field boundaries and numbers. 

Second visit - red dashes
Third visit  - orange dashes
Fourth visit - yellow dashes

Note: We will only mark corrections on the map. Incorrectly drawn fields
will not be erased. There may be no new dates for the survey duration if
fields are drawn correctly from the JES and there are no acreage changes.

For this survey we are not interested in transfer of ownership, etc., except
to know whom to contact for enumeration purposes. Our concern is enumerat-
ing the land use and crop development condition of the segment.

Step 1. Verify that you are looking at the correct photograph(s) and the 
correct printout by locating landmarks on the map and locating 
the segment and tract number on the printout. Record the Julian 
date on all N of N pages of the tract printout. 

Step 2. Verify that the field is drawn correctly on the maps by a) looking 
at the field defined by the map and b) deciding whether the map 
accurately shows the field with respect to common landmarks. If the 
field cannot be observed from public roads contact the tract opera- 
tors and request permission to observe the fields in question, then 
write on the bottom of the printout whether permission was secured 
or refused. (By default, the printout will show "unasked" unless
permission is noted as secured or refused.)



If permission is refused, record observations for fields observable 
from public roads. Enter refused (code - RFSD) for the fields 
not observable from public roads in acres, crop and condition 
columns. If the map is correct go on to Step 3. If the map is 
incorrect, redraw the fields using the correct color scheme before 
beginning Step 3. Do not erase any previous survey boundary lines. 

Step 3. For the given field number, check the acres listed on the printout
versus the map and your own best estimate of the actual field
acreage in whole numbers. If the acres are the same as the pre-
vious visit check (✓) the space for acres. Where a correction
is necessary (i.e., an error has been made or an obvious change
in acreage has occurred since the previous visit) check with the
operator for the corrected acres or record your own acreage obser-
vations where checking is not possible and write a note explaining
the change. See Figure 1 for examples of corrections and changes.

Step 4. If the crop is the same as the previous visit check (✓) the crop
code. If a change has occurred record the corrected crop code
for that field.

Step 5. Using the guide from Section 3.3, write the condition of the crop
on the printout in the space provided. Write a note to explain
any situation that our condition codes do not accurately describe.

Step 6. Repeat steps 1-5 until all fields are completed, then check that
all N of N pages for a tract listing are present and complete.

Step 7. Repeat steps 1-6 until all tracts and segments are completed. 
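
Step 1 asks for the Julian date of the visit. A minimal sketch of the conversion follows, assuming "Julian date" here means the ordinal day of the year (1 to 365 or 366), as on the Julian date table listed with the enumerator's supplies; the function name is illustrative.

```python
from datetime import date

def julian_day(visit: date) -> int:
    """Return the day of the year (1 = January 1) for a field visit.

    ASSUMPTION: "Julian date" in the manual means ordinal day of year.
    """
    return visit.timetuple().tm_yday

# First day of the August 1972 survey period (1972 was a leap year).
print(julian_day(date(1972, 8, 7)))
```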

3.3 Crop codes and conditions 

Since we will be looking at aerial photography and satellite imagery, we
need to know the crop species and the condition code that best describes
each field, coded appropriately on the printout. In order to code the condition
properly you must observe the total area in the field which would be covered
by the crop and then give a subjective evaluation of the crop development
as well as a recommendation for action to be taken.




Crop                Situation                   What to do

Alfalfa             part down, part cut         condition = cut or down,
                                                whichever portion is
                                                the larger.

All crops           very poor stand,            condition = what a
                    dry etc.                    normal healthy crop
                                                would look like with
                                                a note.

Applies to most     two similar species         draw in new field
                    but different planting      boundaries and pro-
                    dates                       perly number and
                                                classify the new
                                                field.

Grass waterways     located on natural          draw in new field and
                    boundaries or can           classify as OTHR and
                    accurately be drawn         specify on printout
                    on the map                  with a note.

Drowned out areas   located near natural        draw in new field and
                    boundaries or can           classify as OTHR and
                    accurately be drawn         specify on printout
                    on the map                  and a note.




ERTS Editing Instructions 


1. Edit the Julian date to correspond to the actual field visit.

2. Check the acres for a given field versus the previous recorded
   [illegible]

   A. Do not edit where column is checked (✓).

   B. In case the acres differ from the previous visit:

      1) [illegible]

      2) Where an error occurred on the previous enumeration, an enumerator's
         note should explain the correction. With an explanation the
         correction will be punched. With no explanation [illegible]

   C. Check (✓) the acres column where the tract was a refusal and the
      field not observable.

3. Check the crop codes for correctness.

   A. Do not edit where the column is checked (✓).

   B. Edit out where they are unrecognizable.

   C. Correct the code when it is a change from the previous month and
      incorrectly written.

   D. Enter RFSD where the tract was a refusal and the field not observable.

4. Check the condition code against nearby fields.

   A. Edit to compare with fields in the tract or segment where the condi-
      tion is not entered.

   B. Edit out where condition is the same as the previous month.

   C. NOTE: On the first visit there must be an entry for every field.




D. Enter RFSD where the tract was a refusal and the field is not observable.

5. Code the permission

A. Unasked = 0

B. Secured = 1

C. Refusal = 2
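
As an illustration, the permission coding can be expressed as a small lookup. The function name and the handling of a blank note are assumptions, chosen to match the default of "unasked" described in the enumerator steps.

```python
from typing import Optional

# Codes as listed in item 5 of the editing instructions.
PERMISSION_CODES = {"unasked": 0, "secured": 1, "refusal": 2}

# ASSUMPTION: enumerators may write "refused" rather than "refusal".
_ALIASES = {"refused": "refusal"}

def permission_code(note: Optional[str]) -> int:
    """Translate the enumerator's permission note into the punched code.

    A missing or blank note is treated as "unasked" (code 0).
    """
    if note is None or not note.strip():
        return PERMISSION_CODES["unasked"]
    key = note.strip().lower()
    key = _ALIASES.get(key, key)
    if key not in PERMISSION_CODES:
        raise ValueError(f"unrecognized permission note: {note!r}")
    return PERMISSION_CODES[key]

print(permission_code(None), permission_code("Secured"), permission_code("refused"))
```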

ERTS Mailing Instructions 

Send the edited printout and the punched cards Air Mail Special Delivery in 
"Special C" envelopes and "Special C" card mailer. 

Each envelope and card mailer should be marked in the lower left hand corner 
as follows: 

REPORT: ERTS Ground Truth 

STATE: Your State (99) (1 of 2) 

Secure each envelope and card mailer with a strand of filament tape each 
way around the envelope and card mailer. 

Send the aerial photos as follows: 

Research and Development Branch 

SRS of USDA 

Room 4837 South Bldg. 

Washington, D.C. 20250 




APPENDIX B 


State Office Instruments 


TERMS AND DEFINITIONS USED FOR JUNE ENUMERATIVE SURVEY (JES) 

AND ERTS FIELDWORK 


A. SAMPLE: 


Information for the ERTS Survey is obtained from a small sampling of the
total land area in four States. Small areas of land have been selected
at random for this survey. Each area to be enumerated has been outlined
in red on the county highway maps and aerial photographs which you are
supplied. Every acre has one and only one chance to be selected in the
sample.

B. SEGMENT : 

Segments are land areas outlined in red on aerial photos. Segments gen-
erally range in size from one-half square mile to three square miles. A
few are larger or smaller depending on locations. Segments are identi-
fied by a permanently assigned number.

C. TRACT: 


A TRACT is an area of land inside the segment which is under one operation.
This tract may consist of agricultural land, non-agricultural land, resi-
dential areas, or some other land use. Examples of tracts are as follows:

(1) An occupied house and land in segment operated by the person
    in charge. Examples are Tracts A and B. Notice that Tract A
    has land at two locations in segment.

    [Figure: Segment 3189]

(2) A farm operator living in the segment on a dwelling where he is not
    the person in charge, and who has no other land in the segment. See
    F S above. If all the land he operates is outside the segment, he is
    still a resident farm operator and should be assigned 1.0 acres of
    land for the tract. He lives in this dwelling.

(3) Any area of land in the segment under one operator who does not
    reside inside the segment. Tracts D and F are examples.





The boundaries of each tract will
be outlined in blue pencil and
each tract will be identified by
a code letter. If a tract consists
of more than one separate parcel of
land, all parcels will be identified
with the same letter; i.e., all of the
land inside the segment that is
operated by one person will be reported
under one tract code.

D. FIELD: 

A field is a continuous area of land inside a tract which is devoted to 
one crop or land use. Each tract will be divided into fields by you 
during this Survey. Each field will be outlined in red pencil and assigned 
a number. 

E. FARM : 

A farm consists of the area or areas of land both inside and outside of 
the segment boundaries under one management on which there were crops, 
livestock, poultry, or some sales of agricultural products at some time 
in 1971 or 1972. 

F. OPERATOR : 

The OPERATOR is the person who is responsible for the day-to-day decisions 
for the tract and total land operated. 

If the tract contains a farming operation, the operator could be the owner,
hired manager, cash tenant, or share tenant. If a person operates farmland
as a hired manager or partner and also operates land for himself as a
separate farm, the managed or partnership land should be separated and
assigned another tract code.

If the land is rented to others or worked by others on shares, the tenant 
or renter is considered the operator of the rented land. 





UNITED STATES DEPARTMENT OF AGRICULTURE
Statistical Reporting Service

O.M.B. Number 40-R2766
Approval Expires 4/30/74

PART A - 3 (Mo.)


State District Segment Tract


JUNE 1972 ACREAGE, 
LIVESTOCK & LABOR 

ENUMERATIVE SURVEY 


Use this questionnaire only if operator lives INSIDE the SEGMENT.

Facts about your farm or ranch will be kept CONFIDENTIAL and used
only in combination with similar reports from other producers.



SEGMENT NUMBER 


TRACT CODE LETTER: 


NAME OF RESIDENT OPERATOR: 

ADDRESS: 

(Route or Street) 


(City) (State) (Zip)

TELEPHONE NUMBER: COUNTY:

NAME OF FARM:

DATE:

Name and Address of PARTNERS:

(Record partnership operations as a separate tract.)

1. How many acres are inside these boundaries
   drawn on the photo (or map)?

2. Will any acres INSIDE these boundaries be IRRIGATED during 1972?

   YES ( ) Ask irrigation questions

   NO ( ) Skip irrigation questions



SECTION A - ACREAGES OF FIELDS AND CROPS IN TRACT

[Questionnaire grid; cell contents are not legible in the source copy. Row
labels that can be made out include field number, total acres in field, crop
or land use (specify), farmstead and roads, pasture (permanent, and cropland
used only for pasture), winter wheat, other crops, and idle cropland (idle
all during 1972).]


Special Keypunch Instructions 



1. Punch 76 in column 1-2 for all cards. 


2. Face page: Punch identification as appears on face page upper right 

hand corner. 

3. Page 2 

a) Punch field numbers as they appear at the top of page. 

b) Item 5: Leave crop code blank and punch acres 'as is.' 

Item 6-9: Punch code and acres 'as is.' 

Item 10-21: Punch code and planted acres only. 

Item 22: Skip. 

Item 23-39: Punch code and acres 'as is.' 

4. Page 3 on 

a) Punch field numbers as they appear at the top of page. 

b) Other Land: Leave crop code blank and punch acres 'as is.'

Permanent Pasture and Cropland Pasture: Punch crop code and acres 

'as is.' 

After Cropland Pasture through Sorghum for Grain: Punch crop code and 

planted acres only. 

Alfalfa Hay through Idle Cropland: Punch crop code and acres 'as is.' 

5. Verify. 

Note: a) Punch acres to one decimal without the decimal point. 

b) Right justify and punch lead zeros for all numbers. 

c) There will be only one code and one acreage figure punched per 
field number. The proper code and acreage will always be the 
first entry in a column for any field number. 
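
Notes (a) and (b) describe a fixed-width numeric format. A minimal sketch follows, assuming a five-column acreage field; the instructions do not give the field width, so `width` is a placeholder and the function name is illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

def punch_acres(acres, width=5):
    """Format an acreage for punching.

    Acres are carried to one decimal place with the decimal point dropped,
    then right-justified with leading zeros (notes a and b).
    ASSUMPTION: `width` (5 columns) is illustrative, not from the source.
    """
    tenths = (Decimal(str(acres)) * 10).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    if tenths < 0:
        raise ValueError("acres must not be negative")
    digits = str(int(tenths))
    if len(digits) > width:
        raise ValueError("value does not fit in the field")
    return digits.zfill(width)

print(punch_acres(12.5))   # 12.5 acres
print(punch_acres(8))      # 8 acres
```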




ERTS SSO Keypunch Instructions 

1. Do not punch blanks or edited out data fields. 

2. Punch only current survey data. 

3. Punch permission code only on the first card. 

4. Punch the first four alpha characters of the recorded condition. 
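
Instruction 4 can be read as a simple truncation rule. The sketch below is one reading of it: it assumes that non-alphabetic characters are skipped and that the result is punched in upper case, neither of which the instructions state. The function name is hypothetical.

```python
def punch_condition(recorded: str) -> str:
    """Return the first four alphabetic characters of a recorded condition.

    ASSUMPTIONS: digits, spaces, and punctuation are skipped, and the
    result is upper-cased for punching.
    """
    letters = [ch for ch in recorded.upper() if ch.isalpha()]
    return "".join(letters[:4])

print(punch_condition("harvested"))
print(punch_condition("cut"))
```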




APPENDIX C 


Grey-Scale Map Computer Program 


WMAP 


This program will: 

1. Map directly from LANDSAT MSS Bulk data tapes (either the original
non-labeled tapes or standard label copies).

2. Map from any one of the four MSS response bands (LANDSAT channels 4, 5,
6, or 7).

3. Compute a histogram of a sample of a designated area and compute grey
scale boundaries for the mapping from this histogram. 1/ The user
may specify as many as 16 grey scale classes. The user may also
specify:

a. that the program assign boundaries so that each grey scale
class will contain about the same number of data points,

b. that the number of data points in a class be proportional
to the square roots of the percentage distributions of the
different response levels found, or

c. that the program use limits which are defined by the user.

4. For very large areas, will map about 14,000 characters a second (CPU 
time) on the WCC IBM 370-168. For smaller areas, e.g. 100 lines and 
100 columns the mapping rate decreases to around 5,000 characters per 
second. 
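
The boundary computation in item 3 can be sketched as follows. This is not the WMAP source, which is not reproduced in this appendix; it is a minimal illustration of options (a) and (b). The square-root option in particular is one possible reading: each response level is weighted by the square root of its frequency and the classes take equal shares of that weight. All names are hypothetical.

```python
from collections import Counter
from math import sqrt

def grey_scale_limits(values, k=9, square_root=False):
    """Return k upper boundaries (response levels) for grey-scale classes.

    values       sampled response values (integers)
    k            number of grey-scale classes (WMAP default is 9)
    square_root  ASSUMED reading of option (b): weight each response level
                 by the square root of its count instead of the count itself
    A repeated boundary means the corresponding class is empty.
    """
    if not 1 <= k <= 16:
        raise ValueError("k must be between 1 and 16")
    counts = Counter(values)
    if not counts:
        raise ValueError("no sample values")
    levels = sorted(counts)
    weights = [sqrt(counts[v]) if square_root else counts[v] for v in levels]
    total = sum(weights)

    limits = []
    running = 0.0
    for level, weight in zip(levels, weights):
        running += weight
        # Close a class each time the cumulative weight reaches the next
        # k-th share of the total.
        while len(limits) < k - 1 and running >= total * (len(limits) + 1) / k:
            limits.append(level)
    limits.append(levels[-1])   # the last class always ends at the maximum
    return limits

sample = [1] * 16 + list(range(2, 10))
print(grey_scale_limits(sample, k=2))                    # equal-frequency classes
print(grey_scale_limits(sample, k=2, square_root=True))  # square-root weighting
```

With equal-frequency classes the heavily populated level 1 fills the first class by itself; under the square-root weighting its influence is damped, so the first boundary moves up and the sparse levels are spread over fewer map characters.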

The USER MUST: 

1. Specify the response band to be mapped (default is LANDSAT Band 5).

2. Specify the number (k) of grey scale divisions to be used in the map-
ping (default is 9).

3. Specify a printable character for each grey scale division.

4. Specify the location of the areas to be sampled for the frequency tabu-
lation and/or to be mapped.


1/ If the total number of data points in the sample area is less than
10,000, then the histogram will include all data points in the designated area.




Control cards required for each run are: 

1. A CLASS card which will define: 

a. the response band to be mapped (punch in column 14). 

b. the number (k ≤ 16) of grey-levels to be used in the mapping (punch
in columns 15-16, right justified).

c. the string of printable characters to be used in the mapping
(punch these in consecutive one-column fields starting with
column 18. The first character will be used for the lowest
level set of response values. Blanks in the string will cause a
blank to be printed for that level(s) on the map).

If mapping in LANDSAT band 7 (MSS channel 4), any data points having
values of 1 to 5 (deep water) and 6 to 9 (shallow water) will be
assigned the characters punched in columns 18 and 19 of the CLASS
card. Therefore, when mapping in band 7, the user should specify
(k+2) printable characters on the CLASS card.

2. At least 1 SAMPLE/MAP AREA block card. 

A SAMPLE AREA card defines an area on the tape which is to be sampled 
for the frequency distribution to be used in determining the class 
levels for the printout. The first card after the CLASS card will 
always be treated as a SAMPLE AREA card. Any later card which has a 
'1' punched in column 20 will also be used as a SAMPLE AREA card. Any
SAMPLE AREA card which has a '1' in column 24 will also be treated as
a MAP AREA card.

The format for the SAMPLE AREA card is : 

C.C. 

1-4 the number of scan lines to be skipped. 

5-8 the number of the last scan line in the desired area.

9-12 the number of data points to the left of the desired area. 1/

13-16 the number of the last data point to be included in the
desired area.


1/ Columns should always be numbered in conformance with the LARS System
whereby data points 1-804 are on tape 1, 805-1614 on tape 2, 1615-2424 on tape
3, and 2425-3228 on tape 4.




20 a '1' (optional if first SAMPLE AREA card).

a '1' (optional, to be used only if class limits are to be
assigned by means of the square root transformation).
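
Because the data point numbers on a SAMPLE AREA card follow the LARS numbering given in the footnote, it can help to check which of the four tapes a given data point falls on. The lookup below is a small sketch using the ranges quoted in that footnote; the function name is illustrative.

```python
# Data point ranges per tape, as quoted in the LARS numbering footnote.
TAPE_RANGES = ((1, 804), (805, 1614), (1615, 2424), (2425, 3228))

def tape_for_data_point(point: int) -> int:
    """Return the tape number (1-4) holding the given data point."""
    for tape, (first, last) in enumerate(TAPE_RANGES, start=1):
        if first <= point <= last:
            return tape
    raise ValueError(f"data point {point} is outside 1-3228")

print(tape_for_data_point(900))
```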

The MAP AREA card defines an area for which a grey-scale printout is
to be produced. As with the SAMPLE AREA card, a single MAP AREA card
can define an area as large as the tape itself (1/4 of a LANDSAT frame)
or anything smaller. However, the output will be in 120 column strips.
The format for the MAP AREA card will be the same as for the SAMPLE
AREA card EXCEPT that:

1. A '1' is also punched in column 24 (optional unless a '1'
has been punched in column 20).

2. A '1' in column 28 indicates that the user has inserted a
'LIMITS' card after that MAP AREA card.

LIMITS Card 

The LIMITS card enables the user to specify the class limits to be used 
for a particular map area, regardless of the values computed by the program. 
The values established by a LIMITS card will continue to be used for suc- 
ceeding map areas until the next LIMITS card is read. 

The values to be punched on the LIMITS card will be the upper boundaries of 
the k grey-scale divisions. They are to be punched in consecutive four 
digit integer fields, from smallest to largest. 
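
The card layouts described above can be sketched as 80-column strings. This is an illustration, not part of WMAP. It assumes a standard 80-column card, blank-padded right-justified integer fields for the four-column values (the text states right justification only for columns 15-16), and that the band is punched as the single LANDSAT band digit. The column for the square-root flag is not legible in the source, so it is left out. All function names are hypothetical.

```python
CARD_WIDTH = 80  # ASSUMPTION: standard 80-column punched card

def _blank_card():
    return [" "] * CARD_WIDTH

def _punch(card, start_col, text):
    """Place text on the card starting at a 1-based column number."""
    end = start_col - 1 + len(text)
    if end > CARD_WIDTH:
        raise ValueError("text runs past the end of the card")
    card[start_col - 1:end] = list(text)

def _field(value, width):
    """Right-justify an integer in a fixed-width field (blank padded)."""
    text = f"{value:>{width}d}"
    if len(text) > width:
        raise ValueError(f"{value} does not fit in {width} columns")
    return text

def class_card(band, k, symbols):
    """CLASS card: band in column 14, k in columns 15-16, symbols from column 18."""
    if not 1 <= k <= 16:
        raise ValueError("k must be between 1 and 16")
    expected = k + 2 if band == 7 else k   # band 7 needs two extra water symbols
    if len(symbols) != expected:
        raise ValueError(f"expected {expected} symbols, got {len(symbols)}")
    card = _blank_card()
    _punch(card, 14, str(band))
    _punch(card, 15, _field(k, 2))
    _punch(card, 18, symbols)
    return "".join(card)

def area_card(skip_lines, last_line, skip_points, last_point,
              sample=False, map_area=False, limits_follow=False):
    """SAMPLE/MAP AREA card: four 4-column fields, flags in columns 20, 24, 28."""
    card = _blank_card()
    for col, value in ((1, skip_lines), (5, last_line),
                       (9, skip_points), (13, last_point)):
        _punch(card, col, _field(value, 4))
    for col, flag in ((20, sample), (24, map_area), (28, limits_follow)):
        if flag:
            _punch(card, col, "1")
    return "".join(card)

def limits_card(upper_bounds):
    """LIMITS card: upper class boundaries in consecutive 4-digit fields."""
    bounds = list(upper_bounds)
    if bounds != sorted(bounds):
        raise ValueError("limits must run from smallest to largest")
    card = _blank_card()
    _punch(card, 1, "".join(_field(b, 4) for b in bounds))
    return "".join(card)

print(class_card(5, 9, "ABCDEFGHI"))
print(area_card(100, 200, 50, 150, map_area=True, limits_follow=True))
print(limits_card([10, 20, 30, 40, 50, 60, 70, 80, 90]))
```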




JCL and Control Card Sequence 
for Program WMAP in USDA Washington Computer Center 


job cards
// EXEC RADLGO,P=RADMAP
//GO.FT08F001 DD UNIT=2400,DISP=(OLD,PASS),DCB=(RECFM=U,
// BLKSIZE=3320),LABEL=(,NL,,IN),[illegible],DSN=
//GO.FT09F001 DD SYSOUT=(C,,8431),DCB=RECFM=FBA
//GO.FT10F001 DD SYSOUT=(C,,8431),DCB=RECFM=FBA
//GO.SYSIN DD *
a CLASS card
initial SAMPLE area card, may also be a MAP area card
as many additional SAMPLE and/or MAP area cards as desired

Label parameter is for non-labeled LANDSAT tape





APPENDIX D 

Detailed Instructions for 
Microdensitometer Scanning of Aerial Photography 



1. Load aerial photography on manual film transport and locate frame contain- 
ing desired segment. 

2. Be certain there is a frame gap (gap between two adjacent frames) within 
the stage travel limits. Calibration is to be performed on this gap. 

3. Rotate the stage to align the scanning axes with section lines and/or major 
roads. 

4. Determine the easternmost point, relative to the stage, that is to be
scanned. Move the stage until the vertical crosshair on the viewing screen
is aligned on that point. Set X to zero on the Digital Coordinate Readout
System (DCRS).

5. Determine the northernmost point, relative to the stage, that is to be 
scanned. Move the stage until the horizontal crosshair on the viewing 
screen is aligned on that point. Set Y to zero on the DCRS. 

6. Advance teletype paper to beginning of next page. With teletype on "LOCAL," 
enter segment number and the words "corner coordinates." 

7. Return teletype to "ON LINE" and enter "E" command.

8. Record field boundary coordinates as follows: 

a. Sequentially number all corner points of fields and other field boundary 
points as needed on the sketch of the segment. 

b. With the microdensitometer in "AUTO" operation mode and the stage control
motors off, manually move the stage to field boundary point 1. Press
"INIT" button to record the coordinates of that point on the teletype
terminal.

c. Repeat (b) for every field boundary point sequentially.

d. With teletype on "LOCAL," record the time required to record field 
boundary coordinates. 

9. Advance teletype paper to beginning of next page. With teletype on "LOCAL" 
record the segment number, Julian date of photography, and magnetic tape 
file numbers for the next 8 files. 

10. Locate the lightest area on the clock in the margin between frames contained
in the frame gap and record the coordinates of this point. This will be the
calibration point for all scans on this segment.




11. Enter "Ul" command after teletype has been placed in "ON LINE" mode. The
"Ul" command allows users to define scanning parameters after being
prompted by the computer as follows:


Prompt          User Response

YDIR
DELTAX
XTRAV
YSTEP
NO. SCANS
SCAN TYPE
SPEED
BBACKUP?


where 14 ■ space bar 


x * max x 4 


y « max y 


+ 1 


Where (x_i, y_i) are field boundary coordinates
displayed on the DCRS in microns.


12. Enter "I" command on teletype to record identification information for this 
segment in the following format: 

Col. 1 - "C", "R", "G", or "B" corresponding to color filter in use.

Col. 2-5 - segment identification number 

Col. 6-8 - three letter state abbreviation 


Col. 9-11 - Julian date of photography 

Col. 12-14 - "DEN" or "TRA" corresponding to scanning mode (density or per- 
cent transmission) 
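
The column layout in step 12 can be sketched as a small formatting routine. This is only an illustration: the function name, the zero padding of the numeric fields, and the sample values (segment 123, state "KAN", day 214) are assumptions, not taken from the operating procedure.

```python
def make_ident(filter_code, segment, state, julian_day, mode):
    """Build the 14-column identification string entered with the "I" command."""
    if filter_code not in ("C", "R", "G", "B"):
        raise ValueError("filter must be C, R, G, or B")
    if mode not in ("DEN", "TRA"):
        raise ValueError("mode must be DEN or TRA")
    if len(state) != 3:
        raise ValueError("state abbreviation must be three letters")
    # Zero padding of the numeric fields is an assumption of this sketch.
    return f"{filter_code}{segment:04d}{state.upper()}{julian_day:03d}{mode}"


# Hypothetical segment scanned with the red filter in density mode.
print(make_ident("R", 123, "KAN", 214, "DEN"))
```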

13. Perform analog system calibration (Section 4.5.3.3, Microdensitometer
Operation Manual, p. 4-18) at the calibration point (see step 10).




14. Enter "CS" command on teletype. Make sure stage control motors are turned 
on. 

15. Change color filter and repeat steps 12-14 until the segment has been 
scanned with all four color filters. 

16. With teletype on "LOCAL," record time required to scan the segment.


APPENDIX E


A PROGRAM TO CONVERT PDS MICRODENSITOMETER 
SCAN LINES INTO SAS COMPATIBLE OBSERVATIONS 


This program is designed to convert a Photometric Data System (PDS) microden- 
sitometer scan into a Statistical Analysis System (SAS) compatible multivariate 
observation. Up to 4 scans of the same area may be included in the SAS obser- 
vation. 

The user controls the number of scans (normally 1 for each filter) to be used 
in building the multivariate observations. The microdensitometer scans are 
read in serially and saved on temporary files. After all the data for a given 
picture section (pisect) has been read in, the temporary files are rewound and 
read back a line at a time, and a SAS observation produced for each point in 
the line. Each observation consists of data from corresponding points from 
all scans used. 

The program is divided into 3 phases: (1) parameter phase, (2) read phase, 
and (3) combine phase. The normal operation of the program is to go from 
phase 1, to phase 2, to phase 3, and repeat as desired. 


Parameter phase: 

Allows the user to define the initial settings for all counters and
indicators used during the read and combine phases. If fatal errors occur
during the run, control reverts to the parameter phase for an error scan of 
all remaining control cards, but no data will be processed. 

Read phase: 

During the read phase, microdensitometer scans are read in and stored on
temporary files. During this process, the PDS 9-track format is converted
to an 8-bit internal IBM notation. If the data were scanned in a raster
or right edge scan, they are converted to a left edge scan. The user,
however, may elect to cancel this option and accept the data in the order 
scanned. While in read phase, all parameter definition cards are ignored. 
If an attempt is made to read more than 4 scans, the combine phase is 
automatically entered. 

Combine phase: 

This phase combines the results of the read phase. Corresponding points 
from each read file are included in each SAS observation produced. The 
data from the reads are put in correspondence with the data items in the
SAS observation set. If there are fewer than 4 scans to be combined, the
trailing data items are assigned the missing value. The coordinate values 
and pixel serial numbers are computed and assigned as each observation is 
produced. At the conclusion of this phase, control reverts to the parameter 
phase, and new parameter settings will be accepted. 
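
The combine phase described above can be sketched as follows. This is a minimal illustration under stated assumptions: scans are held in memory as lists of lines rather than on temporary files, all scans have the same shape, and `None` stands in for the missing value. The function name `combine` is hypothetical.

```python
def combine(scans, start_psn=1):
    """Merge 1 to 4 scans of the same area into observation tuples.

    scans: list of scans; each scan is a list of lines, each line a list of
    values. All scans are assumed to have the same shape.
    Returns tuples (x, y, psn, v1, v2, v3, v4); None marks a missing scan.
    """
    if not 1 <= len(scans) <= 4:
        raise ValueError("between 1 and 4 scans are required")
    padding = [None] * (4 - len(scans))   # trailing items get the missing value
    observations = []
    psn = start_psn                       # pixel serial number
    for y, first_line in enumerate(scans[0]):
        for x in range(len(first_line)):
            values = [scan[y][x] for scan in scans] + padding
            observations.append((x, y, psn, *values))
            psn += 1
    return observations


clear = [[10, 11], [12, 13]]
red = [[20, 21], [22, 23]]
for obs in combine([clear, red]):
    print(obs)
```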






NUMERIC VALUE REPRESENTATION 

The microdensitometer output is a digital value. The signal is first sent
to a logarithmic converter before going to the panel display meter and then
to the A/D converter. The digital output ranges from 0 to [illegible] and
represents transmission or density [illegible].

When the value is stored in the computer (PDP8), it is multiplied by
[illegible] and is now 400 times the value shown on the panel meter.
This is done to reduce the effect of [illegible].

ppsdddddppdddddn

where p represents the padding bits appended to fill the 9-track tape format,
      s is the PDP8 sign bit and is normally 0,
      d represents one of the 10 data bits from the A/D converter,
      n represents the noise bit position, normally 0.

In reconstructing the microdensitometer data back into a useable form, the
program allows the user two choices. By default, storage type values will be
produced. Optionally, actual panel display values may be generated.

Storage type data [illegible] for bulk storage [illegible] exactly 1 byte of
storage. This is the form used by ERTS, LARSYS, and the Penn State
Classification System.
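
As an illustration of the bit layout ppsdddddppdddddn, the sketch below extracts the 10 data bits from one two-byte tape word. It assumes the bits are listed most significant first and that the word is split into two bytes as `ppsddddd` and `ppdddddn`; the function names are hypothetical. The .005 panel increment is taken from the VALUE card description.

```python
def decode_pds_word(byte1: int, byte2: int) -> int:
    """Extract the 10 A/D data bits from one two-byte tape word.

    Assumed layout, most significant bit first:
        byte1 = p p s d d d d d
        byte2 = p p d d d d d n
    """
    high = byte1 & 0b00011111          # five data bits after padding and sign
    low = (byte2 >> 1) & 0b00011111    # five data bits before the noise bit
    return (high << 5) | low


def panel_value(count: int) -> float:
    """Panel reading for a 10-bit count, at .005 per count."""
    return count * 0.005


count = decode_pds_word(0b00011111, 0b00111110)
print(count, round(panel_value(count), 3))
```

Under these assumptions the largest count, 1023, corresponds to the largest panel value, 5.115.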




The numeric range of the integer valued data is from 0 to 255. Approximate 
panel values may be derived by multiplying a storage value by .02. At first, 
it may seem that we are discarding valid data, but this is not so if we con- 
sider the accuracy of the microdensitometer. 

The microdensitometer specifies linearity of ±.02 density or .5% transmission,
and that the drift for a 10 hour period is less than ±.02 density or less
than 1% transmission. This means that a recorded value could differ from the 
true value by as much as .04 density or 1.5% transmission. The stored values 
will resolve density to the nearest .02 units and transmission to the nearest 
.4% (.3921569), which is within the limits of the equipment. 

The PanelData option allows the reconstruction of exact panel readings as 
shown by the panel display meter. The data accuracy implied is beyond the 
capability of the equipment, but it should be useful in checking machine
specifications.




X Y COORDINATE SYSTEM 


The program assumes a generalized coordinate reference system. The x,y
coordinates are signed integers, with (0,0) as the default origin. The x
ordinate is the element index, and the y ordinate is the line index. The
program always assigns the algebraically smallest x,y value to the pixel in
the northwest corner (upper left). The x ordinate increases as the scan moves
to the east (right), and the y ordinate increases as the lines move south (down).

The PDS microdensitometer normally scans lines in a raster (back & forth) with
the direction of scan alternating, and can scan lines from top to bottom or 
bottom to top. The Photometric Data System Conversion to Microdensitometer 
Scan (PDSCMS) program has the ability to determine the scanning directions, 
and use this in the coordinate assignment algorithm. Thus, regardless of how 
the points are scanned, the above defined coordinate reference system is valid. 
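
The coordinate assignment can be sketched as below. This is not the PDSCMS algorithm itself, only an illustration of the stated convention: lines are assumed to alternate direction, the two direction flags are assumed to be known, and the function name is hypothetical.

```python
def assign_coordinates(lines, first_left_to_right=True, top_to_bottom=True):
    """Return (x, y, value) triples with (0, 0) at the northwest corner.

    lines: list of equal-length scan lines in the order scanned.
    Lines are assumed to alternate direction (raster scan).
    """
    n_lines = len(lines)
    pixels = []
    for i, line in enumerate(lines):
        left_to_right = first_left_to_right if i % 2 == 0 else not first_left_to_right
        row = line if left_to_right else line[::-1]   # turn right-to-left lines around
        y = i if top_to_bottom else n_lines - 1 - i   # line index grows to the south
        pixels.extend((x, y, v) for x, v in enumerate(row))
    # Sorting by line, then element, gives the natural display order.
    return sorted(pixels, key=lambda p: (p[1], p[0]))


# Two raster lines: the second was scanned right to left.
print(assign_coordinates([[1, 2, 3], [6, 5, 4]]))
```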

The program computes the coordinates during the combine phase. The coordinates 
of the physically first point are computed and assigned to that point. If this 
point is not the northwest corner point, the coordinates of the northwest corner
point are derived. The program prints out the northwest corner coordinates as
the first x and y ordinates. 

The above described coordinate reference system may seem unduly complicated, 
but it (1) sets up a reference system that is both hardware and software compa- 
tible, and (2) permits full use of the microdensitometer scanning ability. 

Display devices such as line printers and CRT devices display data from left
to right and top to bottom. The natural order of computer indexing is from
smallest to largest. Thus, after coordinates are assigned, data points may be
sorted by coordinate and they will be in the natural order for computer pro- 
cessing regardless of how scanned. 

The user may have several scans from a scene with the microdensitometer defin- 
ing the origin at each pisect. The conversion software would call that point 
(0,0) by default. Later, the user may wish to restore or assign relative 
position of pisects by relocation. The user could also move the origins of 
all pisects from the microdensitometer (0,0) setting to any arbitrary point
(n,n) . 

The user may have the microdensitometer scan several pisects from a scene 
relative to a common origin. The conversion software will compute initial 
coordinates for each pisect using the microdensitometer supplied locations. 
Thus, the resulting pixel coordinate will preserve the relative spatial loca- 
tion of the pisects relative to the scene origin. Later, the user may wish 
to perform an origin transformation, and spatially locate this scene relative 
to any other independently scanned scene. 




SAS OBSERVATIONS 


Each observation produced has 11 items as follows: 


SCENE-NAME    1-8 characters left justified with trailing blanks in bytes
              5-12.

              This name is used to identify a collection of pisects (picture
              sections). If the user fails to supply a valid name, the
              program will use the current date in the form mm/dd/yy by
              default.

PISECT-NAME   1-8 characters left justified with trailing blanks in bytes
              13-20.

              This name is used to identify a pisect within a scene. A new
              name is supplied for each pisect processed. If the user fails
              to supply a valid name, the program will use the current value
              of the system clock in the form hh.mm.ss by default.

GROUP-NAME    1-8 characters left justified with trailing blanks in bytes
              21-28.

              This name is used to identify calibration data. A null or
              'blank' name indicates unknown data. The discriminant function
              uses named groups as training, and classifies unknown data.
              If the user fails to supply a valid name, the program
              supplies the null or 'blank' name by default.

IDENT-NAME    1-8 characters left justified with trailing blanks in bytes
              29-36.

              This name is used to establish user identity of unknown data.
              A null or 'blank' name indicates that the user does not know or
              cannot identify the item. Valid ident-names are taken from the
              set of group names. The discriminant function would use the
              ident-name to check classification accuracy. If the user fails
              to provide a valid name, the program supplies the null or
              'blank' name by default.

XORD          integer binary in bytes 37-40.

              This is the relative position of the SAS observation within a
              line of data. It always gives relative element position within
              its own pisect, and depending on user options may be positional
              relative to an entire scene or group of scenes.

YORD          integer binary in bytes 41-44.

              This is the relative line position of the SAS observation. It
              always gives relative line position within its own pisect, and
              depending on user options may be positional relative to an
              entire scene or group of scenes.

PSN           integer binary in bytes 45-48.

              This is the pixel serial number assigned by the program. Pixels
              are serialized in order processed in the combine phase. Unless
              directed otherwise, pixels are serialized for the entire run
              starting with 1. The serial number may be signed.

PIXF1V        real binary in bytes 49-52.

              This is the microdensitometer value for the first scan read for
              the current pisect. It will never be assigned the missing
              value. 1/

PIXF2V        real binary in bytes 53-56.

              This is the microdensitometer value for the second scan read
              in for the current pisect. If there was no second scan, it
              takes on the missing value.

PIXF3V        real binary in bytes 57-60.

              This is the microdensitometer value for the third scan read in
              for the current pisect. If there was no third scan, it takes on
              the missing value.

PIXF4V        real binary in bytes 61-64.

              This is the microdensitometer value for the fourth scan read in
              for the current pisect. If there was no fourth scan, it takes
              on the missing value.

The program writes the SAS compatible file in binary (unformatted) variable
blocked spanned mode (RECFM=VBS). Because SAS includes the record description
word as part of the record, the byte locations of all items have been
offset by 4 bytes in the above description.


1/ The missing value is a floating point -0, or in hexadecimal 80000000.
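
The record layout above can be read outside SAS with a short routine such as the one below. It is a sketch under several assumptions that the text does not state: character items are taken to be EBCDIC (decoded here with Python's `cp037` codec), integers are taken to be 32-bit big-endian, and reals are taken to be IBM single precision hexadecimal floating point. The function names are hypothetical, and `None` represents the missing value.

```python
import struct

MISSING = 0x80000000  # floating point -0 marks a missing scan value


def ibm32_to_float(word: int):
    """Convert a 32-bit IBM hexadecimal float to a Python float (None if missing)."""
    if word == MISSING:
        return None
    sign = -1.0 if word & 0x80000000 else 1.0
    exponent = (word >> 24) & 0x7F                 # excess-64 exponent, base 16
    fraction = (word & 0x00FFFFFF) / float(1 << 24)
    return sign * fraction * 16.0 ** (exponent - 64)


def parse_observation(record: bytes) -> dict:
    """Split one 64-byte record (descriptor word in bytes 1-4) into named items."""
    names = [record[i:i + 8].decode("cp037").rstrip() for i in (4, 12, 20, 28)]
    xord, yord, psn = struct.unpack(">3i", record[36:48])
    pix = [ibm32_to_float(w) for w in struct.unpack(">4I", record[48:64])]
    obs = dict(zip(("SCENE", "PISECT", "GROUP", "IDENT"), names))
    obs.update(XORD=xord, YORD=yord, PSN=psn, PIXF=pix)
    return obs
```

Slices are zero-based, so bytes 5-12 of the description correspond to `record[4:12]`.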






CONTROL CARDS 


The program uses 14 different control cards. Most of them are optional
because the program will supply default values when the user does not. Each
control card is divided into 3 major fields as follows: (1) key word or
op-code in columns 1-8; (2) parameter field in columns 11-50; and (3) comments
field in columns 51-80.
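
The three-field card layout can be illustrated with a small parser. This is a sketch only: the function names are hypothetical, the sample card is invented, and the two 5-column signed integers follow the ORIGIN and RELOCATE card descriptions.

```python
def parse_card(card: str):
    """Split an 80-column card image into (keyword, parameters, comments)."""
    card = card.ljust(80)[:80]
    keyword = card[0:8].strip()      # columns 1-8
    params = card[10:50]             # columns 11-50, kept unstripped
    comments = card[50:80].strip()   # columns 51-80
    return keyword, params, comments


def parse_offsets(params: str):
    """Read the two signed integers in columns 11-15 and 16-20."""
    return int(params[0:5]), int(params[5:10])


# Hypothetical ORIGIN card with a comment starting in column 51.
card = "ORIGIN".ljust(10) + "  100" + " -200" + " " * 30 + "SHIFT TO SCENE 2"
keyword, params, comments = parse_card(card)
print(keyword, parse_offsets(params), comments)
```

The parameter field is left unstripped because a leading blank is significant on the READ card.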

There are 4 classes of control cards, depending on the kind of action to be 
performed. Each class is described separately below: 

Class 1 - Run Cards 

These cards set indicators that remain in effect for the duration of the 
run or until redefined during the run. All run cards are optional. 

EDGE Card 

cols 1-8 EDGE 

This card causes the program to convert all scans to a
left edge scan. This effectively removes the raster produced
by the back and forth microdensitometer scanning motion. All
lines running from right to left are turned around. If
an EDGE card is not supplied, it is assumed.

ASIS Card 

cols 1-8 ASIS 

This card causes the program to accept the data points in the 
order scanned. However, the x,y coordinates assigned are
computed based on line direction. If the pixels are sorted 
based on the x,y coordinates, a normal picture will be pro- 
duced. That is, the true northwest corner point has the 
algebraically smallest coordinates, and the southeast corner 
has the algebraically largest coordinates. If an ASIS card 
is not supplied, EDGE is assumed by default. 

ABL Card 

cols 1-8 ABL 

This card causes the program to accept microdensitometer data
sets that have been identified with blank or first character blank
labels. By default such scans are rejected as a fatal error.
Note that once turned on this option cannot be rescinded
during a computer run.



VALUE Card 

cols 1-8 VALUE 

cols 11-18 STORAGE or PANEL

This card allows the user to select the type of numeric
values to produce for the SAS file. Storage values are
normalized floating point integers, range 0 ≤ value ≤ 255.
Panel values are also normalized floating point, but are
the microdensitometer A/D converter output expressed as a
display panel number. The range is 0.000 ≤ value ≤ 5.115,
in increments of .005. A storage value is numerically 50
times the panel value with the decimal fraction truncated.
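
The relation between the two value types can be sketched with integer arithmetic. The formulation in terms of an integer A/D count is an inference from the stated ranges (a panel increment of .005 implies count = panel value x 200, so 50 times the panel value is the count divided by 4); it is not given in the text, and the function names are hypothetical.

```python
def storage_from_count(count: int) -> int:
    """Storage value (0-255) from a 10-bit A/D count (0-1023).

    The panel value is count * .005, so 50 * panel = count / 4; integer
    division performs the truncation.
    """
    if not 0 <= count <= 1023:
        raise ValueError("count must be a 10-bit value")
    return count // 4


def approximate_panel(storage: int) -> float:
    """Approximate panel value recovered from a storage value."""
    return storage * 0.02


for count in (0, 403, 1023):
    s = storage_from_count(count)
    print(count, round(count * 0.005, 3), s, round(approximate_panel(s), 2))
```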

Class 2 - Scene Cards 

These cards set parameters that apply only to the scene about to be 
processed. They are automatically cleared to default values after a 
STACK control card. All scene cards are optional. 

SCENE Card 

cols 1-8 SCENE 

cols 11-18 1-8 character name left justified with trailing blanks, used
to identify a group of pisects.

The contents of columns 11-18 are placed in the scene-name field
of the SAS compatible record. If the user does not name a scene,
the program supplies the current date by default.

PSN Card 

cols 1-8 PSN 

cols 11-15 signed integer constant starting serial number. 

This card can be used to extend the serialization of previous 
computer runs. If the user does not supply a starting serial 
number, a value of 1 will be used by default. The STACK con- 
trol card resets PSN to 1. 

ORIGIN Card 

cols 1-8 ORIGIN 

cols 11-15 signed integer constant x coordinate offset. 


cols 16-20 signed integer constant y coordinate offset.

This control card is used to provide origin translation of
each pisect processed. The coordinates of the first point
are computed and the offset applied. It may be used to
relate the pisects from the current scene to those in a
previous or subsequent scene. This feature may be useful
when the photographic data are from sequential [illegible].

Class 3 - Pisect Cards

These cards set parameters that apply only to the pisect about to be
processed. They are automatically cleared to default values after the
COMBINE control card. All pisect cards are optional.


PISECT Card

cols 1-8 PISECT

cols 11-18 1-8 character name left justified with trailing blanks.

The contents of columns 11-18 are saved in the pisect-name field
in the SAS Compatible Record. It serves to identify pisects
within scenes. If the user does not supply a PISECT card,
the program uses the current value of the system clock by
default.

GROUP Card

cols 1-8 GROUP

cols 11-18 1-8 character name left justified with trailing blanks.

The contents of columns 11-18 are placed in the group field
of the SAS Compatible Record. A non-blank name indicates
that this pisect contains calibration data for a specific
group. If the user does not supply a group name, the program
inserts a blank name by default.

IDENT Card

cols 1-8 IDENT

cols 11-18 1-8 character name left justified with trailing blanks.

The contents of columns 11-18 are placed in the ident-name
field of the SAS Compatible Record. A non-blank name indicates
that the user has identified the points in this pisect as
belonging to the specified group. If the user does not supply
an IDENT, the program inserts blanks by default.


RELOCATE Card 


cols 1-8 RELOCATE 

cols 11-15 signed integer constant representing the northwest x ordinate. 

cols 16-20 signed integer constant representing the northwest y ordinate. 

The northwest corner pixel will be assigned the coordinates 
given on this card. All subsequent pixels will be assigned 
coordinates relative to these. Thus, any pisect can be arbi- 
trarily moved in space. By default, absolute relocation will 
not be performed. 

This card overrides the origin transformation in effect for 
each pisect for which relocation is performed. The origin 
transformation will be performed for each pisect not relocated.
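
The precedence between the ORIGIN and RELOCATE cards can be sketched as follows. The function and argument names are hypothetical; the sketch only shows that an absolute relocation, when present, replaces the origin offset for that pisect.

```python
def place_pisect(first_nw, origin=(0, 0), relocate=None):
    """Return the coordinates assigned to the northwest corner pixel.

    first_nw: northwest corner as computed from the scan.
    origin:   (x, y) offsets from an ORIGIN card, if any.
    relocate: (x, y) from a RELOCATE card, or None if no card was given.
    """
    if relocate is not None:
        return relocate            # RELOCATE overrides the ORIGIN offset
    return (first_nw[0] + origin[0], first_nw[1] + origin[1])


print(place_pisect((0, 0), origin=(100, -200)))
print(place_pisect((0, 0), origin=(100, -200), relocate=(500, 500)))
```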

Class 4 - File Manipulation Cards

These control cards cause data to be moved from one file to another, and 
to perform some transformations in the process. These cards are required 
as specified below. 


READ Card

cols 1-8 READ

cols 11-50 1-40 character name left justified with trailing blanks.

This card causes the program to read in 1 PDS microdensitometer
scan and store it on a temporary file. One read card is required
for each scan to be included in a SAS observation. When a 
read card is processed, while the program is in the parameter 
phase, control is switched to the read phase. No more para- 
meter cards will be honored until control reverts back to 
the parameter phase. 

Up to 4 consecutive read cards will be honored. If a 5th 
read card is encountered, the program will combine the 4 scans 
already stored on temporary files, and then scan the remaining 
control cards for errors. No more data will be transferred. 
Either an end-of-file, a combine card, or a stack card must 
follow read cards. 


The 1-40 character name is used for label checking as follows:

(1) If the name is absent or begins with a blank, the program
assumes that no label checking is to be performed,
and whatever file it finds is assumed to be correct.

(2) If a name is present, it must match the label put in 

the scan line by the microdensitometer operator. Label 
checking is performed up to the first blank character 
in the supplied name. Thus, if the user has a common 
prefix for a series of scans, he may use an abbreviated 
label to verify that the correct scans are being pro- 
cessed. If the label check fails, no more files are 

processed, but the remaining control cards are checked 
for errors. 
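
The two label checking rules can be expressed compactly. This is a sketch with a hypothetical function name and invented labels; it does not cover the separate ABL card rule for blank tape labels.

```python
def label_matches(card_name: str, tape_label: str) -> bool:
    """Apply the READ card label check to one tape label."""
    if not card_name or card_name[0] == " ":
        return True                       # rule 1: no label checking requested
    prefix = card_name.split(" ", 1)[0]   # rule 2: compare up to the first blank
    return tape_label.startswith(prefix)


print(label_matches("S1234", "S1234RED"))   # abbreviated label, common prefix
print(label_matches("S9999", "S1234RED"))   # mismatch: processing would stop
```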

COMBINE Card 


cols 1-8 COMBINE 


This card causes the program to combine the results of the 
previous reads and add the results to the SAS compatible 
data set being built. If n scans are being combined, exactly 
n-1 combine cards are required. The last combine card in the 
control card stream is optional as any uncombined reads are 
automatically combined at end-of-file. At the end of a com- 
bine operation, the program returns to the parameter phase 
and will accept parameter control cards. 


STACK Card

cols 1-8 STACK

This card is the same as combine in that the results of
the previous reads are combined and concatenated to the
SAS compatible data set being built. In addition, the data
set is endfiled and the scene and pisect indicators cleared
to default values. Any control statements following a
STACK control card cause PDSCMS to start a new SAS compatible
file. This new file may be stacked or separated, depending
on the JCL used for the run.

Both STACK and COMBINE cards may be used in the same run
providing at least 1 read operation is performed between
them. If a STACK card would be the physically last control
card, it can be omitted.






EXECUTING THE PDSCMS PROGRAM 


The PDSCMS program is executed by using the RADLGO procedure. The PDS micro- 
densitometer tape is read from unit 8, and the converted file is written on 
unit 9. Program control cards are read from SYSIN. 

The microdensitometer output is a series of stacked data sets on magnetic
tape. The program reads as many data sets from the stack as directed by READ
control cards by incrementing the unit 8 FORTRAN Sequence Number. Each READ
control card requires a unit 8 DD JCL statement with an appropriate sequence
number. The data set sequence number in the LABEL parameter points to the
particular scan to be processed by the READ command.


//FT08F001  DD  LABEL=(i,NL,,IN)   for first READ card
//FT08F002  DD  LABEL=(j,NL,,IN)   for second READ card
//FT08F003  DD  LABEL=(k,NL,,IN)   for third READ card
    .
    .
//FT08Fnnn  DD  LABEL=(n,NL,,IN)   for nnn'th READ card

The letters i, j, k, n represent the data set sequence number of the tape
file to be processed. They point to the i'th, j'th, k'th, and n'th data
set respectively.


The converted SAS file is written on Unit 9 in FORTRAN binary (unformatted) 
mode as either a single data set or a series of stacked (separated) data sets. 
Stacking is performed by incrementing the Unit 9 FORTRAN Sequence Number. The 
DD statement parameters determine if stacking or separation is being performed. 


//FT09F001  DD  LABEL=p   initial output from PDSCMS
//FT09F002  DD  LABEL=q   after the first STACK card
//FT09F003  DD  LABEL=r   after the second STACK card
    .
    .
//FT09Fmmm  DD  LABEL=s   after the (mmm-1)'th STACK card.

The letters p, q, r, s represent the data set sequence number on the tape
being produced. If the data sets were being written on disk, separate names
would be required.




SAMPLE JCL USING TAPE INPUT & OUTPUT 


//XO EXEC RADLGO,
//   P=PDSCMS
//GO.FT08F001 DD DISP=OLD,UNIT=2400,DCB=(BLKSIZE=6400,RECFM=U,BUFNO=1),
//   VOL=SER=URxxxx,
//   LABEL=(i,NL,,IN)
//GO.FT08F002 DD DISP=OLD,UNIT=2400,DCB=*.FT08F001,VOL=REF=*.FT08F001,
//   LABEL=(j,NL,,IN)
    .
    .  as many FT08Fyyy DD statements as required: extra ones do no harm.
    .
//GO.FT08Fnnn DD DISP=OLD,UNIT=2400,DCB=*.FT08F001,VOL=REF=*.FT08F001,
//   LABEL=(k,NL,,IN)
//GO.FT09F001 DD DISP=(,PASS),UNIT=2400,DCB=(BLKSIZE=6400,LRECL=32000,RECFM=VBS,
//   BUFNO=1),
//   DSN=dsname,
//   LABEL=(p,,,OUT)
//GO.FT09F002 DD DISP=(OLD,PASS),UNIT=2400,DCB=*.FT09F001,VOL=REF=*.FT09F001,
//   DSN=*.FT09F001,
//   LABEL=(q,,,OUT)
    .
    .  as many FT09Fyyy DD statements as required: extra ones do no harm.
    .
//GO.FT09Fmmm DD DISP=(OLD,PASS),UNIT=2400,DCB=*.FT09F001,VOL=REF=*.FT09F001,
//   DSN=*.FT09F001,
//   LABEL=(r,,,OUT)

PDSCMS control cards

/* EOJ.




SAS PROCESSING THE COMPATIBLE FILE 


JCL Requirements 


In order to process the compatible file with the SAS program, an additional
DD statement is required by the RADSAS procedure. This statement is required
to point to the file to be used. In the following JCL, the PDSFILE DD
statement is used to gain access to the converted PDS data.


//S EXEC RADSAS
//PDSFILE DD DSN=dsname,DISP=OLD,UNIT=2400,VOL=SER=xxxxxx
//SYSIN DD *
    .
    .  SAS program statements
    .
/* EOJ.

In the above example, the converted file is assumed to reside on magnetic tape
as a single unstacked data set. If the file is not on magnetic tape, or is
passed from a previous job step, an appropriate alteration in the PDSFILE DD
statement will be required.

If the stack option has been used to stack or separate scan pictures, a separate
DD statement is required for each stacked data set to be read in during a given
SAS run. If the data sets are stacked on tape, extra DD statements may be left
in the job stream whether needed or not. The following JCL illustrates the
setup for stacked data sets on tape.


//ST EXEC RADSAS
//STACK1 DD DISP=OLD,UNIT=2400,
//   DSN=dsname,VOL=(,RETAIN,SER=xxxxxx),
//   LABEL=p
//STACK2 DD DISP=OLD,UNIT=2400,DSN=*.STACK1,VOL=(,RETAIN,REF=*.STACK1),
//   LABEL=q
    .
    .  as many stack DD statements as may be needed: extra ones do no harm.
    .
//STACKn DD DISP=OLD,UNIT=2400,DSN=*.STACK1,VOL=(,RETAIN,REF=*.STACK1),
//   LABEL=r
//SYSIN DD *
    .
    .  SAS program statements
    .
/* EOJ.


The user may substitute any name for PDSFILE, but that name must also be 
used in the SAS INPUT statement. 


If the converted files are separated on disk, a file must exist for each DD
statement in the SAS step. If both PDSCMS and SAS are executed in the same
job, SAS DD statements may point to PDSCMS DD statements that were not used
in the PDSCMS step. However, if the SAS program is run as a separate job, all
the converted files referred to by DD statements must actually exist in order
to prevent JCL errors.

SAS Program Statements 

The SAS program must be directed to use the PDSFILE DD statement for its input.
The model statements given below can be used to read in all the items from the
converted file.

DATA;

INPUT DDNAME=PDSFILE SCENE $ 5-12 PISECT $ 13-20 GROUP $ 21-28
IDENT $ 29-36 XORD IB 37-40 YORD IB 41-44 PSN IB 45-48
PIXF1V RB 49-52 PIXF2V RB 53-56 PIXF3V RB 57-60 PIXF4V RB 61-64;

The user may not wish to read in all the items. Those items not wanted may
be omitted from the list in the input statement. The following statement
shows how to read in only the data from the first and third read cards.

DATA;

INPUT DDNAME=PDSFILE PIXF1V RB 49-52 PIXF3V RB 57-60;

In order to read stacked or separated data sets in to the SAS system, the 
user must provide a separate INPUT statement for each separated file. Each 
data set referred to by the INPUT processor must actually exist. Data sets
that do not exist or have never been created cause SAS to abend.

The following example illustrates a simplified method of reading redundant 
type data sets by using a SAS macro. 

MACRO WHATEVER SCENE $ 5-12 PISECT $ 13-20 GROUP $ 21-28
IDENT $ 29-36 XORD IB 37-40 YORD IB 41-44 PSN IB 45-48
PIXF1V RB 49-52 PIXF2V RB 53-56 PIXF3V RB 57-60 PIXF4V RB 61-64 %

(other SAS statements could also be included in the macro to perform
special transformations, range checks, etc.)

DATA STK1; INPUT DDNAME=STACK1 WHATEVER;

DATA STK2; INPUT DDNAME=STACK2 WHATEVER;

    .
    .  as many statements as required: extra ones must be removed.
    .

DATA STKn; INPUT DDNAME=STACKn WHATEVER;






The PDSCMS program assigns the missing value to the PIXFiV elements for which
there was no corresponding read card. The user can do 1 of 4 things with
missing values: (1) accept data with missing values and let SAS handle them,
(2) do not read in the pixel filter values that are missing, (3) convert the
missing value to some neutral value, or (4) identify and take special action
for missing items.

Sample Program to Convert Missing Values 

PIXF2V=PIXF2V+0;

PIXF3V=PIXF3V+0;

PIXF4V=PIXF4V+0;
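
The effect of adding zero can be illustrated by analogy in Python. Note the assumption: Python uses IEEE floating point, not the IBM hexadecimal format of the SAS file, so this only demonstrates the same idea, that adding 0 to a negative zero yields an ordinary positive zero. The function name is hypothetical.

```python
import math


def clear_negative_zero(value: float) -> float:
    """Adding 0 to a negative zero yields a positive zero; other values are unchanged."""
    return value + 0


result = clear_negative_zero(-0.0)
# copysign exposes the sign of a zero, which ordinary comparison hides.
print(math.copysign(1.0, -0.0), math.copysign(1.0, result))
```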




DATA CONVERSION 


Microdensitometer data is expected to be used from a storage format which is 
an 8 bit integer value from 0 to 255 inclusive. Storage data can either 
represent densities (logarithmic response), or transmission (linear response). 
Simple linear transformations are required to reduce storage values 
into the corresponding panel meter value, optical density, or percent trans- 
mission. 


Storage values can be converted directly into corresponding panel meter
values by multiplying by .02. 1/ The resultant is either an optical density
or transmission value, depending on the microdensitometer calibration settings
when the scan was performed.

When the microdensitometer is calibrated to record densities, the panel value
is optical density. Storage values are increments of .02 density units with
a valid range from 0.00 to 4.00 inclusive. Density readings larger than 4.00
constitute an overflow condition because they are beyond the specified range
of the equipment.

When the microdensitometer is recording transmissions, the stored data repre- 
sents an incremental percent transmission that is dependent on the gain set- 
ting during calibration. Normally, the gain is set at 5.10 to give maximum 
range and accuracy to the transmission levels. The incremental step is then 
.3921569% transmission. 


In addition, it may be useful to convert the storage data from logarithmic
densities into linear transmissions and vice versa. In the following
relationships, the transmission calibration (Gain) is assumed to be 5.10. The
density is always calibrated to 0.


The following symbols are used in the equations that follow.

SD    Density (logarithmic) storage value        0 ≤ SD ≤ 200

ST    Transmission (linear) storage value        0 ≤ ST ≤ 255

G     Gain setting for transmission              nominal value 5.10

PT    Percent transmission                       0 ≤ PT ≤ 100

OD    Optical density                            0 ≤ OD ≤ 4.00


1/ Described in the numeric representation section.




The relationship between optical density and transmission is:

Density = log10 (1/Transmission)

If we impose on this basic relationship the requirement that 100% transmission
is 0 density and 0% transmission is 4.00 density, the equation can be rewritten
as:

OD = 2 - log10 (PT)

PT = 10 ** (2 - OD)

Note that the relationship of 0% transmission = 4.00 optical density requires
a mathematical impossibility, namely log10 (0) = -2, and 10 ** (-2) = 0. These
conditions are definitional and are imposed by the resolution limits of the
electronic circuitry in the microdensitometer. During computer processing this
limiting point requires special handling. Computationally, the valid
conversion ranges for percent transmission and optical density are:

0 < PT ≤ 100
4.00 > OD ≥ 0

Also, be aware that 4.00 optical density can be transformed into the
computationally valid percent transmission value .01. If storage transmissions
are being produced, the minimum storage value is .39% and is larger than .01.
An attempt to produce a storage value for .01% transmission will result in a 0
value.

Because density to transmission computations can be performed over the
entire density range, it is possible to computationally extend the valid
transmission range beyond 2.3 optical density. An image is digitized in
densities and the corresponding percent transmission computed. Thus, percent
transmission values less than .30 can be used in computations, but cannot be
produced by the microdensitometer, nor stored in standard form.

The equation to convert stored density data into optical density is:

OD = SD * .02

The equation to convert stored transmission data into percent transmission is:

PT = ST * .3921569        when G = 5.10

PT = ST * (2/G)           0 < G ≤ 5.10

The following transformations are used to convert logarithmic values into linear 
values and vice versa. 




To convert stored density into percent transmission use:

PT = 10 ** (2 - (SD * .02))

To convert stored density into stored transmission use:

ST = 10 ** (2.40654 - (SD * .02))              G = 5.10 implied

To convert optical density into stored transmission use:

ST = 10 ** (2.40654 - OD)                      G = 5.10 implied

To convert stored transmission into stored density use:

SD = (2 - log10 (ST * .3921569)) * 50          G = 5.10
SD = (2 - log10 (ST * (2/G))) * 50             0 < G ≤ 5.10

To convert percent transmission into stored density use:

SD = (2 - log10 (PT)) * 50
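
The conversions above can be collected into a few functions for checking values by hand. This is a sketch: the function names are hypothetical, the gain defaults to the nominal 5.10, and the limiting point PT = 0 is rejected rather than given the special handling the text mentions.

```python
import math


def od_from_sd(sd: float) -> float:
    """Optical density from stored density."""
    return sd * 0.02


def pt_from_st(st: float, gain: float = 5.10) -> float:
    """Percent transmission from stored transmission."""
    return st * (2.0 / gain)


def pt_from_od(od: float) -> float:
    """Percent transmission from optical density."""
    return 10.0 ** (2.0 - od)


def sd_from_pt(pt: float) -> float:
    """Stored density from percent transmission (PT must be above 0)."""
    if pt <= 0:
        raise ValueError("percent transmission must be greater than 0")
    return (2.0 - math.log10(pt)) * 50.0


def st_from_sd(sd: float, gain: float = 5.10) -> float:
    """Stored transmission from stored density."""
    return pt_from_od(od_from_sd(sd)) / (2.0 / gain)


print(pt_from_st(255), pt_from_od(4.0), sd_from_pt(1.0), st_from_sd(100))
```

Dividing by 2/G in `st_from_sd` is equivalent to the constant 2.40654 in the printed equations when G = 5.10, since log10 (.3921569) is approximately -.40654.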






APPENDIX F 


Field Extraction Program 


Version 1 and 2 


Introduction: 


This program generates SAS program statements and control cards for PDSCMS to
facilitate conversion and identification of microdensitometer data into final
form suitable for discriminant analysis. It is a special purpose program
with few options and little in the way of error checking. It is the user's
responsibility to make certain the input data is in the correct form, as
described in the input section. There are two versions of the program. The
major difference between the two versions is the input required for each. Thus,
the input section of this paper is divided into two sections, one for version
1, and the other for version 2. Any other differences between versions will be
noted in the appropriate sections. The output from the two versions is
identical.


JCL Requirements : 


//jobcard
/*ROUTE PUNCH LOCAL
// EXEC RADGO,
// P=RSFEP1                      THIS CARD FOR VERSION 1.
// P=RSFEP2                      THIS CARD FOR VERSION 2.
//GO.FT08F001 DD SYSOUT=B
//GO.FT09F001 DD SYSOUT=A
//GO.SYSIN DD *
{ input cards }
/* EOJ.


Output: Output is routed through logical units 6, 7, 8, and 9. Logical units
6 and 9 are for printed output, units 7 and 8 for punched output.

The printed output on unit 6 consists of job processing information
and images of PDSCMS control cards. The PDSCMS control cards are
punched from unit 7. SAS program statements for field extraction
are punched from unit 8 and printed from unit 9.

PDSCMS Control Cards : 

1. SCENE     state name (8 character maximum)

2. PISECT    segment number (4 characters)

3. READ      label for clear filter

4. READ      label for red filter

5. READ      label for green filter

6. READ      label for blue filter

7. STACK or  see PDSCMS 2.1.0 for
   COMBINE   effects of each.

Two sets of control cards are punched for each segment: one set for data
scanned in the density mode, one set for data scanned in the transmission
mode. The labels on the READ cards will match the identification label on
the microdensitometer tape only if those identification labels are in the
following form:

col 1:     'C', 'R', 'G', or 'B' corresponding to the filter in use.

col 2-5:   segment number

col 6-8:   first three characters of state name

col 9-11:  Julian date of photography

col 12-14: 'DEN' or 'TRA' corresponding to scanning mode

col 15-40: any other information desired by the user
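
The label layout above can be illustrated with a short Python sketch. It is
not taken from the program itself: the function names and the sample values
are hypothetical, and because the source does not say how short segment
numbers or dates are padded, the sketch simply requires the exact widths.

```python
FILTERS = ("C", "R", "G", "B")   # clear, red, green, blue
MODES = ("DEN", "TRA")           # density and transmission scanning modes


def read_label(filter_code, segment, state, julian_date, mode, extra=""):
    """Build a tape identification label in the column layout listed above."""
    if filter_code not in FILTERS:
        raise ValueError("filter must be one of C, R, G, B")
    if mode not in MODES:
        raise ValueError("mode must be DEN or TRA")
    segment = str(segment)
    julian_date = str(julian_date)
    if len(segment) != 4:
        raise ValueError("segment number must occupy 4 columns")
    if len(julian_date) != 3:
        raise ValueError("Julian date must occupy 3 columns")
    return (
        filter_code                # col 1
        + segment                  # col 2-5
        + state[:3].ljust(3)       # col 6-8
        + julian_date              # col 9-11
        + mode                     # col 12-14
        + extra[:26]               # col 15-40, optional user text
    )


def segment_labels(segment, state, julian_date):
    """Labels for both card sets of one segment: four filters in each mode."""
    return [
        read_label(f, segment, state, julian_date, m)
        for m in MODES
        for f in FILTERS
    ]
```

For a hypothetical segment 0412 in Kansas photographed on day 213, the label
for the red filter in density mode would be R0412KAN213DEN.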

Input: The first data card is the same for both versions. It must be in the
following form:

col 1:    blank

col 2-23: &NUMBER OFSEGS=xx,&END

where xx = the number of segments to be processed.

For each segment, the following cards are required: 

VERSION 1 : 

1. State, segment number, and STACK or COMBINE

   Col. 1-12:  state name, left justified.

               The first three characters are used to create label
               information for READ cards for PDSCMS. The state name
               is also output as the SCENE identifier for PDSCMS.

   Col. 15-18: segment number.

               Used for label information for READ cards, and output
               as PISECT identifier.

   Col. 21-28: STACK or COMBINE (see PDSCMS 2.1.0 for effect of each).

2. Scanning information in the following form:

   Col 1:    blank

   Col 2-54: &INFO PHOTO=www,DELTAX=xxx,DELTAY=yyy,NOFLDS=zzz &END

   where www = Julian date of photography
         xxx = delta x for scanning
         yyy = y step for scanning
         zzz = number of fields in this segment to be
               processed. This must equal the number
               of FID cards for the segment.


3. FID cards: corner coordinates, tract, field and crop identifiers.

   col 1-3:   FID
   col 5-11:  x1
   col 12-18: y1
   col 19-25: x2
   col 26-32: y2
   col 33-39: x3
   col 40-46: y3
   col 47-53: x4
   col 54-60: y4

              where (x1,y1) = N.E. corner of field
                    (x2,y2) = N.W. corner of field
                    (x3,y3) = S.E. corner of field
                    (x4,y4) = S.W. corner of field

   col 61-62: two digit integer corresponding to tract identification.
   col 63-64: two digit integer corresponding to field identification.
   col 65-72: 8 character crop identifier.
   col 73-80: 8 character crop identifier.

The effect of each FID card is to create a SAS program statement
which will append tract, field, and crop identifiers to most data
points within the quadrangle specified by the corner coordinates
on the FID card. Not all points will be identified, since boundary
points are deleted and the program operates only on rectangular
areas parallel to the scanning axes which are contained within the
specified quadrangle. The following assumptions are also made:

1. |min (x2, x4)| > |max (x1, x3)|

2. |min (y3, y4)| > |max (y1, y2)|

3. (xi, yi), i = 1, 2, 3, 4, are measured in microns.

4. No origin offset will be used in PDSCMS.




Restriction number 3 above can be bypassed. If the (xi, yi) are in
pixel coordinates as produced by PDSCMS, then specify DELTAX=1,
DELTAY=1 on the scanning information card, rather than their true
values. Irregular (non-rectangular) fields may be split by the user
into two or more rectangular fields parallel to the scanning axes
in order for the maximum number of points in the field to be
identified.
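
As an illustration of the FID card layout and of the rectangular area the
program works on, the following Python sketch reads one card image and
computes an axis-parallel rectangle lying inside the quadrangle. It is an
assumption-laden sketch, not the program's actual logic: the function names
are hypothetical, the corner order is taken as N.E., N.W., S.E., S.W., the
coordinates are assumed to grow in absolute value toward the west and south,
and the deletion of boundary points is not modelled.

```python
def parse_fid(card):
    """Split an 80-column FID card image into its fields."""
    card = card.ljust(80)
    if card[0:3] != "FID":
        raise ValueError("not a FID card")
    # Eight 7-column coordinate fields start in col 5: x1, y1, ..., x4, y4.
    v = [float(card[4 + 7 * n: 11 + 7 * n]) for n in range(8)]
    return {
        "corners": ((v[0], v[1]), (v[2], v[3]), (v[4], v[5]), (v[6], v[7])),
        "tract": int(card[60:62]),       # col 61-62
        "field": int(card[62:64]),       # col 63-64
        "crop": card[64:72].strip(),     # col 65-72
        "crop2": card[72:80].strip(),    # col 73-80
    }


def inner_rectangle(corners):
    """Axis-parallel rectangle inside the quadrangle (assumed corner order
    N.E., N.W., S.E., S.W.), returned as (x_low, x_high, y_low, y_high)."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = corners
    x_low = max(abs(x1), abs(x3))    # innermost of the two east corners
    x_high = min(abs(x2), abs(x4))   # innermost of the two west corners
    y_low = max(abs(y1), abs(y2))    # innermost of the two north corners
    y_high = min(abs(y3), abs(y4))   # innermost of the two south corners
    if x_low >= x_high or y_low >= y_high:
        raise ValueError("corner coordinates leave no interior rectangle")
    return x_low, x_high, y_low, y_high
```

A field whose corners are badly skewed leaves only a small rectangle, which
is why splitting an irregular field into several nearly rectangular pieces
identifies more of its points.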

Version 2 : 

1. State, segment number, and STACK or COMBINE in same format as 
Version 1. 

2. Scanning information in the following form: 

col 1: blank 

col 2-64: &INFO PHOTO=www,DELTAX=xxx,DELTAY=yyy,NOFLDS=zzz,
          NOPNTS=ttt,&END

where www, xxx, yyy, zzz are as defined in Version 1,
and ttt is the number of corner points in the segment.

3. Coordinates for each field corner point in the segment.

   col 4-10:  xi

   col 14-20: yi

   col 24-27: i

   where (xi,yi) is the ith corner point in the segment. These
   cards must be in order from the smallest to largest i, where
   i = 1,2,3,...,n.

4. SFID cards: subscript of corner points, tract, field and crop
   identifiers.

   col 1-4:   SFID
   col 11-13: i
   col 16-18: j
   col 21-23: k
   col 26-28: l

              where (xi,yi) = N.W. corner
                    (xj,yj) = N.E. corner
                    (xk,yk) = S.W. corner
                    (xl,yl) = S.E. corner

   col 31-32: integer tract identifier
   col 35-36: integer field identifier
   col 39-46: eight character crop identifier
   col 49-56: eight character crop identifier
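
The Version 2 input amounts to a table of numbered corner points plus SFID
cards that refer to those points by subscript. The Python sketch below shows
one way such cards could be read and resolved into corner coordinates. The
function names are hypothetical and the column positions follow the layouts
listed above; it is an illustration only, not the program's own code.

```python
def parse_point(card):
    """Corner point card: x in col 4-10, y in col 14-20, subscript in col 24-27."""
    card = card.ljust(27)
    x = float(card[3:10])
    y = float(card[13:20])
    i = int(card[23:27])
    return i, (x, y)


def parse_sfid(card):
    """SFID card: four corner subscripts followed by the identifiers."""
    card = card.ljust(56)
    if card[0:4] != "SFID":
        raise ValueError("not an SFID card")
    # Subscripts i, j, k, l in col 11-13, 16-18, 21-23 and 26-28.
    subscripts = tuple(int(card[a:b]) for a, b in ((10, 13), (15, 18), (20, 23), (25, 28)))
    return {
        "subscripts": subscripts,
        "tract": int(card[30:32]),       # col 31-32
        "field": int(card[34:36]),       # col 35-36
        "crop": card[38:46].strip(),     # col 39-46
        "crop2": card[48:56].strip(),    # col 49-56
    }


def resolve_corners(sfid, points):
    """Look up the N.W., N.E., S.W. and S.E. corners named on an SFID card."""
    return tuple(points[s] for s in sfid["subscripts"])
```

Because adjoining fields share corner points, each point is keypunched once
in the point table and then referenced by subscript from every SFID card
that uses it.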




Implementation of Version 2 considerably reduces setup time for 
scanning, and time required to record field identification for key- 
punching. By entering the E command on the microdensitometer, then 
positioning the stage at a field corner point and depressing the 
PROG INIT button, the coordinates of that point are printed out on 
the teletype. Field corner point coordinates can then be keypunched 
directly from the teletype printout. On the sketch of the segment,
it is no longer necessary to record the coordinates for each point;
only its subscript need be recorded. Then, on the SFID keypunch form,
it is only necessary to record the subscript for each corner point,
not the full set of coordinates. This should reduce the man-hours
required for each of these steps by better than 50%.


